65
Lecture 1 on Hash Functions July 26, 2011 Lecture 1: Hash Functions: Short Identifiers for Large Files We introduce a problem of giving unique identifiers to (potentially) large files and study how short such identifiers can be. Hash functions are de- signed for giving standard ways of solving the problem.

Hash Functions: lecture series by Ahto Buldas

Embed Size (px)

DESCRIPTION

This course is intended to give an overview of the nature of hash functions, their cryptographic and security properties and time-stamping as a practical usage for hash functions. The first part of the lecture series gives an overview of what hash functions are. In the second part, we take a look at the cryptographic security requirements for hash functions. The third part of the series deals with the matter of security properties of hash functions.In the fourth part, we explore hash functions which are provably secure but inefficient and hash functions which can be used practically. The fifth part in the series shows a practical application of hash functions on the example of time-stamping. In the sixth and seventh part of the lectures, we look at security requirements for hash functions used in time-stamping and ways of proving how specific hash functions meet requirements for hash functions.Lecturer Ahto Buldas is a professor at Tallinn University of Technology and University of Tartu, specializing in complexity theory, combinatorics, cryptography and data security. He has worked extensively with hash functions and their usage in timestamping and together with Margus Niitsoo in 2010 showed how global scale time-stamping can be used with 256-bit hash functions.

Citation preview

Page 1: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Lecture 1:Hash Functions:

Short Identifiers for Large Files

We introduce a problem of giving unique identifiers to (potentially) largefiles and study how short such identifiers can be. Hash functions are de-signed for giving standard ways of solving the problem.

Page 2: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Hash Functions

A hash function converts a large, possibly variable-sized amount of datainto a small datum (hash) that may serve as an index to an array.

Hash functions are mostly used to accelerate table lookup or data compar-ison tasks—such as finding items in a database, detecting duplicated orsimilar records in a large file, finding similar stretches in DNA sequences,etc.

With a hash function in use, we can check the presence of an item in adatabase with just one single lookup!

The idea was proposed in the 1950s, but the design of good hash functionsis still a topic of active research.

2

Page 3: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Collisions

A hash function may map two or more keys to the same hash value—thisis what we call a collision. In most applications, it is desirable to minimizethe occurrence of collisions, which means that the hash function must mapthe keys to the hash values as evenly as possible.

3

Page 4: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Birthday Paradox

Birthday paradox: For randomly chosen 25 people, very likely we have twowith the same birthday.

NB! This does NOT mean that if you are among those people, you will verylikely find someone with the same birthday!

4

Page 5: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

How many different hash values do we need?

How large hash values should we use for uniquely identifying N items?

Answer: for a good hash function, we need aboutN2 different hash values.

If N = 2256 documents will be created all over the world during the ex-istence of mankind, we need about 2 · 256 = 512 output bits to identifythem uniquely.

Eddington number : number of protons in the observable universe is about

1.57 · 1079 ≈ 136 · 2256 .

This is still just a small fraction of all possible 40 byte sequences!

5

Page 6: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Modeling Hash Functions With Random Oracles

The values of a good hash function are distributed as evenly as possible.

A good hash function is modeled as a random function (a random oracle)—the computation of hash value is replaced with a uniformly random choice(all hash values occur with the same probability 1/M ).

6

Page 7: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Bounds for Collision Probability

Say we have N files and the hash function has k output bits, i.e. thereare M = 2k possible hash values. If the hash function is modeled as arandom oracle, the probability P (N,M) that all N files have different hashvalues has the following bounds:

1−N(N − 1)

2M≤ P (N,M) ≤ e−

N(N−1)2M .

Homework exercise: Show that P (N,M) can only decrease if the hashfunction deviates from random oracle, i.e. the output probabilities differfrom 1

M .

7

Page 8: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

A Useful Inequality from Analysis

We used the inequality 1+x ≤ ex that holds for all real x. This also meansthat 1− x ≤ e−x.

The inequality holds because ex is convex and 1 + x is the tangent line ofex at x = 0.

8

Page 9: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Upper Bound

For the upper bound, we use the observation that

P (N,M) =(

1−1

M

)·(

1−2

M

)· . . . ·

(1−

N − 1

M

)≤ e−1/M · e−2/M · . . . · e−(N−1)/M = e−(1+2+...+N−1)/M

= e−N(N−1)/2M .

Explanation: We compute the hashes of the N files one by one, consid-ering each hash computation as a uniformly random selection of a hashvalue. Before the i-th selection, we already have i − 1 selected hash val-ues and the probability that the new hash is one of those is i−1

M .

9

Page 10: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Lower Bound

To get a lower bound for P (N,M), we first observe that for any pair offiles, the probability that they have the same hash is 1/M . As there areN(N − 1)/2 pairs, the probability of having a collision is upper boundedby N(N−1)

2M and hence,

P (N,M) ≥ 1−N(N − 1)

2M.

10

Page 11: Hash Functions: lecture series by Ahto Buldas

Lecture 1 on Hash Functions July 26, 2011

Examples

For large N ≈√M , we have

1

2≤ limM→∞

P (√M,M) ≤ e−1/2 ≈ 0.6065306597 .

For the Birthday paradox, we set N = 25 and M = 365 and get

P (N,M) ≤ 0.4395878005 .

11

Page 12: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

Lecture 2:Cryptographic Security Requirements

for Hash Functions

We study how to protect the uniqueness of identifiers against maliciousbehavior of users. Cryptographic hash functions are designed for thispurpose. We describe the main properties that are required from crypto-graphic hash functions: one-wayness, second pre-image resistance, andcollision resistance. We go through some examples to show where thesesecurity properties are needed in practice.

Page 13: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

Three Main Security Properties

Cryptographic hash functions are designed for having the following secu-rity properties:• One-wayness—hardness of finding a message given its hash.• Second pre-image resistance—hardness of modifying a message with-out changing its hash.• Collision-freeness—hardness of finding two different messages with thesame hash.

Collision freeness is the strongest requirement of the three, because theadversary has the largest degree of freedom.

Note that CRC32 does NOT meet any of the three requirements! It is in-tended to fight against occasional errors, not malicious attacks.

13

Page 14: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

One-Wayness (or Pre-Image Resistance)

Given a hash h(X) of a randomly chosen input X, it is hard to find aninput X ′ with the same output h(X ′) = h(X). Here, X is not known tothe adversary!

14

Page 15: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

Second Pre-Image Resistance

Given a randomly chosen inputX, it is hard to find a different inputX ′ 6= X

with the same output h(X ′) = h(X). Here, X is known to the adversary!

15

Page 16: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

Collision Resistance (or Collision-Freeness)

It is hard to find two distinct inputs X 6= X ′ with the same output h(X ′) =

h(X).

16

Page 17: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

Examples for One-Wayness and Second

Pre-Image Resistance

We need one-wayness in a situation, where X contains secret data. Wehave to uniquely identify X but do not want that the identifier reveals thecontents of X.

We need second pre-image resistance if we want to protect a public doc-ument X from malicious modifications. For example, if we keep the hashh(X) of X in a safe place, it should be impossible to create a modifieddocument X ′ with the same hash, because then the modifications wereundetectable.

17

Page 18: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

Digital Signatures and Collision Resistance

We create two contractsX andX ′ with the same hash, whereX ′ is clearlyfavorable to us.

We show X to our partner who then signs the hash h(X).

Later, we show X ′ in court with the partner’s signature on h(X ′) = h(X)and claim that he/she accepted the conditions of X ′.

18

Page 19: Hash Functions: lecture series by Ahto Buldas

Lecture 2 on Hash Functions July 26, 2011

Additional Remark about Digital Signatures

It may seem that in the last example, we can overcome such a situation bystating (in law) that a signature that uses h is invalid whenever one comesup with a collision for h.

However, this would lead to another way of deception. We prepare to con-tracts X and X ′, where X is the contract we actually sign and X ′ is an-other contract with the same hash h(X ′) = h(X) but which is much moreprofitable to us than X.

Our intention is to deny having signed X by showing in court the otherversion X ′ and claiming that X ′ was actually the contract we wanted tosign.

19

Page 20: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

Lecture 3:Relations Between the Security Properties

of Hash Functions

It turns out that some of the security properties for hash functions canbe derived from others. In this lecture, we prove mathematically that thecollision-resistance is the strongest requirement of the tree because it im-plies both the one-wayness and the second pre-image resistance.

Page 21: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

Relative Security Proofs by Reductions

To prove relations between security properties, we use reductions.

To prove that property P implies property Q, we use the contraposition:

If there is an efficient A that breaks Q, we construct an efficient A′ thatbreaks P.

21

Page 22: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

Collision Resistance Implies Second Pre-Image

Resistance.

Every adversary A that finds second pre-images for h can be modified toa collision finding adversary A′:• A′ first generates a random input X and then• uses A to find an X ′ 6= X such that h(X ′) = h(X).

22

Page 23: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

Collision Resistance Implies One-Wayness ...

... if the hash function is sufficiently compressing. Every adversary A thatis able to invert h can be modified to a collision finding adversary A′:• A′ generates a random X, computes y = h(X) and• uses A to find X ′ such that h(X ′) = h(X).

If h is sufficiently compressing then Pr[X ′ = X] is small. Hence, A′ verylikely finds a collision when A is successful.

23

Page 24: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

Mathematical Details

Direct computation of the success of A′ is not that trivial because X ′ de-pends on X, and we cannot view X and X ′ as independent random vari-ables. However, if the output value y is known and fixed then X ′ becomesindependent of X. Indeed, we can regenerate X with a uniform choicefrom the pre-image of y, and by doing so we do not change the distributionof X. Hence, the work of A′ is equivalent to the following (inefficient!) pro-cedure:• Choose a uniformly random input Z and compute y = h(Z).• Choose a uniformly random X from the h-preimage of y.• Apply A to obtain X ′ = A(y).

24

Page 25: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

The steps 2 and 3 are independent random choices and hance the proba-bility C(y) that a collision is produced for y is:

C(y) = Pr[A inverts y] · Pr[X 6= X ′] = Pr[A inverts y] ·|h−1(y) | −1

|h−1(y) |

= Pr[A inverts y] ·(

1−1

|h−1(y) |

).

24

Page 26: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

The overall success δA′ of A′ is the average of this probability:

δA′ =∑y

Pr[y] · C(y) =∑y

Pr[y] · Pr[A inverts y] ·(

1−1

|h−1(y) |

)

=∑y

Pr[y] · Pr[A inverts y]︸ ︷︷ ︸δA

−∑y

Pr[A inverts y] · Pr[y] ·1

|h−1(y) |︸ ︷︷ ︸1/N

= δA −1

N·∑y

Pr[A inverts y]︸ ︷︷ ︸≤M

≥ δA −M

N,

where δA is the success probability of A, and N,M are the input set sizeand the output set size, respectively.

24

Page 27: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

Insufficiently Compressing Hash Functions

Insufficiently compressing function can be collision-free but not one-way.

If there exists a collision free hash function h [with (k− 2)-bit output] thenthere exists a function h′ that is collision-free but not one-way.

We define h′ so that it returns a (k−1)-bit output when given a k-bit input.

h′(x) =

{1‖z, if x = 11z0‖h(x), otherwise

25

Page 28: Hash Functions: lecture series by Ahto Buldas

Lecture 3 on Hash Functions July 26, 2011

• Finding collisions for h′ is as hard as finding collisions for h.• Inverting h′ is relatively easy, because with probability 1/4, a random x

is of the form 11z and the output 1z of h′ completely reveals the input.

25

Page 29: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

Lecture 4: Provably Secure vs PracticalHash Functions

We study two main goals when constructing a hash function. Provably se-cure hash functions are more reliable but inefficient. In practice, we usemore efficient ad hoc designs, which have no guarantees against discover-ing more and more efficient attacks. In this lecture, we present some basicideas how to construct provably secure hash functions. We also character-ize the practical hash functions and summarize how secure some of themare, considering the known attacks against them.

Page 30: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

The Idea of Provably Secure Hash Functions

The security of a hash function h can be based on a hard mathematicalproblem if we can prove that finding collisions for h is as hard as solvingthe problem.

This gives us much stronger security than just relying on complex mixingof bits.

A cryptographic hash function h has provable security against collision at-tacks if finding collisions is reducible to a problem P which is supposed notto be efficiently solvable. The function h is then called provably secure.

Proof by Reduction: If finding collisions if feasible by algorithm A, we couldfind and use an efficient algorithm S (that uses A) to solve P , which is sup-posed to be unsolvable. That is a contradiction. Hence, finding collisionscannot be easier than solving P .

27

Page 31: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

Some Hard Mathematical Problems for Provably

Secure Hashing

Provably secure hash functions are based on hard mathematical problemsthat are mostly related to well-known mathematical functions.

Some problems, that are assumed not to be efficiently solvable:• Discrete Logarithm Problem: Given ax mod p, find x, where p is a largeprime number.• Finding Modular Square Roots: Given a2 mod n, find a, where n is ahard to factorize composite number.• Integer Factorization Problem: Given n = p · q, find p and q, where p, qare two large primes.

28

Page 32: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

Example of a Provably Secure Hash Function

Modular exponent function:

h(x) = ax mod n,

where n is a hard to factor composite number, and a is a pre-specifiedbase value.

A collision ax ≡ ax′

(mod n) reveals a multiple x − x′ of the order of a(in the multiplicative group of residues modulo n). Such information can beused to factor n assuming certain properties of a.

Such an h is inefficient because it requires 1.5 multiplications per input-bit.

29

Page 33: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

Design Principles for Practical Hash Functions

Practical hash functions are combinations of bit operations which are hardto study, partly because of their ”ugly” mathematical properties.

The avalanche criterion requires that if an input is changed slightly (forexample, flipping a single bit) the output must change significantly (e.g.,many output bits flip). First used by Horst Feistel (1915-1990).

The Strict avalanche criterion (SAC) is a generalization of the avalanchecriterion. It is satisfied if, whenever a single input bit is complemented,each of the output bits changes with probability 1

2. Introduced by Websterand Tavares in 1985.

The bit independence criterion (BIC)—two output bits should change inde-pendently when any single bit (not among these two) is inverted.

30

Page 34: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

Merkle-Damgard Design

A hash function must be able to process an arbitrary-length message intoa fixed-length output. This is achieved by breaking the input up into blocks.

The last block processed should also be unambiguously length padded.This construction is called the Merkle-Damgard construction. Most widelyused hash functions, including SHA-1 and MD5, take this form.

31

Page 35: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

Merkle-Damgard construction is as resistant to collisions as is its compres-sion function—every collision for the full hash function can be traced backto a collision in the compression function.

The construction has certain inherent flaws, including length-extension andgenerate-and-paste attacks, and cannot be parallelized.

Many candidate hash functions in the current NIST competition are built ondifferent constructions.

31

Page 36: Hash Functions: lecture series by Ahto Buldas

Lecture 4 on Hash Functions July 26, 2011

Practical Hash Functions and their Known Strength

32

Page 37: Hash Functions: lecture series by Ahto Buldas

Lecture 5 on Hash Functions July 26, 2011

Lecture 5:Time-stamping as an Application for Hash

Functions

Time-stamping is one of the areas where cryptographic hash functions areused. We introduce the basic concepts of time-stamping and describe howthe hash functions can be and are used to create efficient practical designsfor secure time-stamping services.

Page 38: Hash Functions: lecture series by Ahto Buldas

Lecture 5 on Hash Functions July 26, 2011

Motivation

Proving that a document (file) existed at certain time.

Usage examples:• Intellectual property—knowing the earliest time stamp for an inventionwould be a strong argument for the authorship.• Revocation of signature keys—when a signature key has been revoked,it should still be possible to distinguish between the signatures createdbefore the revocation and those created after.

Note that if you just having a time stamp for a document does not provethat you knew the contents of the documents at that time. Knowing a hashof a document does not imply the knowledge of contents.

Time stamps are not intended to prove the ownership of information butrather the existence (i.e. that somebody or something had the contents).

34

Page 39: Hash Functions: lecture series by Ahto Buldas

Lecture 5 on Hash Functions July 26, 2011

Trivial Solution with a Trusted Server

Trivial solution may be (and is) used which involves a trusted server, whoreceives hash values of documents from clients and signs them togetherwith current time. The main shortcomings of this solution are:•We have to trust the server completely; the server can never be assumedto be malicious.• If the signature key of the server accidentally becomes public, then allthe previous time stamps become useless.

35

Page 40: Hash Functions: lecture series by Ahto Buldas

Lecture 5 on Hash Functions July 26, 2011

Hash and Publish Solution

A hash of a document (or a hash of a batch of documents) is published ina widely witnessed way like in newspapers or in public web-pages.

This provides an irrefutable proof of existence of those documents at thetime of publishing.

36

Page 41: Hash Functions: lecture series by Ahto Buldas

Lecture 5 on Hash Functions July 26, 2011

Merkle Tree: How to publish many hashes?

37

Page 42: Hash Functions: lecture series by Ahto Buldas

Lecture 5 on Hash Functions July 26, 2011

Time Stamp = Hash Chain

For example, for Document 1, the time stamp contains:• hash 00, to establish a connection between Document 1 and hash 0.• hash 1, to establish a connection between hash 0 and root hash.

Such time stamps are compact because the number of hashes in a timestamp is logarithmic in the number of leaves.

38

Page 43: Hash Functions: lecture series by Ahto Buldas

Lecture 5 on Hash Functions July 26, 2011

Efficient Hash and Publish with Merkle Trees

A hash tree is created every second and the root hash values are publishedin a widely witnessed way.

39

Page 44: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Lecture 6: Security Requirements forHash Functions Used in Time-Stamping

What do we require from a secure time-stamping solution and what kind ofcryptographic hash functions do we need? It turned out that it is not easyto consistently formalize the concept of a secure time-stamping scheme.It also turned out that the three classical security requirements for hashfunctions are not exactly what we need in time-stamping schemes. Weexplain why the early attempts to formalize the security of time-stampingfailed and outline the consistent definitions that are known today.

Page 45: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

An Early Attempt to Define a Security Condition

In 1997, Haber and Stornetta proposed the following adversary A:• A sends requests x1, . . . , xm to a server, and after the server has cre-ated a hash tree and published the root hash r,• A comes up with x 6∈ {x1, . . . , xm} and a hash chain from x to r.

41

Page 46: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Inconsistency of the Condition

Adversary may send a request x1 = h(x, x′), so that x 6∈ {x1, . . . , xm}.A hash chain from x1 to r can easily be extended for x by just adding astep that contains x′ and establishes a connection between x and x1.

The security condition is violated, but nothing is back-dated. The requestx should be ”new” and not having involved in pre-computations.

42

Page 47: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Unpredictable Adversaries

An improved security condition proposed by Buldas and Saarepera (2004)and refined by Buldas and Laur (2006) assumes that the new request x ischosen (by the adversary) in a random way.

More formally, it is assumed that the new request should be unpredictableeven having all data that the adversary had in the first stage of the attack.

43

Page 48: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

This means that any efficient predicting algorithm Π that hash access to allcomputations of the first stage of the adversary can guess the value of x(computed by the second stage of A) only with negligible probability.

43

Page 49: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Improved Condition with Unpredictable Adversaries

Time-stamping scheme should be secure against all efficient unpredictableadversaries A:• A sends requests x1, . . . , xm to server, and after the server has createda hash tree and published the root hash r,• A outputs an unpredictable request x and a hash chain from x to r.

44

Page 50: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

As A is unpredictable, the condition x 6∈ {x1, . . . , xm} is no more neces-sary. Moreover, this would be very hard to test in practice because the listof all requests is too large and cannot be examined by any real observer.Also, A may just send a one single request r to the service without reveal-ing how it is computed. Hence, the attack scenario reduces to:• A publishes r,• A comes up with an unpredictable request x together with a correct hashchain from x to r.

44

Page 51: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Collision-Resistance is Still Insufficient

Improper computation of r may help to back-date new requests withoutexposing collisions.

It might be possible to compute root hash values for very large hash treeswithout actually computing all nodes of the tree.

If you know that 1 + 2 + . . .+ 100 = 5050 this does not mean that youactually summed up all these integers—you might use the multiplicationinstead: 1 + 2 + . . .+ 100 = 100·101

2 = 5050.

Buldas and Saarepera (2004): if collision-resistant functions exist, thenthere is a collision-free hash function that enables back-dating random doc-uments.

45

Page 52: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Hash Chain Limitations as a Solution

The hash tree used for time-stamping should somehow be restricted (forexample, length limitations to hash chains, fixed shape of the tree, etc.).

First correct security proof for restricted schemes was given by Buldas andSaarepera (2004).

For example, tn the Guardtime system, there are explicit length restrictionsfor hash chains:• Each hash step includes a length byte that represent the number of pre-ceding steps in the hash chain.• Verification procedure checks, whether the length bytes are in increasingorder.

46

Page 53: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Why Collision-Resistance and One-Wayness are

Unnecessary?

Say, we can find a collision h(X) = h(X ′) where X and X ′ are twodifferent documents and obtain a time stamp for h(X).

The fact that we also get a time stamp for X ′ this way should not be con-sidered as an attack against the time-stamping scheme, because nothingis back-dated!

The adversary can only get two time stamps for the price of one!

Buldas and Laur (2006): if there exists a hash function h that is secure fortime-stamping schemes, then there is a hash function h′ that is secure fortime-stamping but is not collision-resistant and not even one-way.

47

Page 54: Hash Functions: lecture series by Ahto Buldas

Lecture 6 on Hash Functions July 26, 2011

Such a hash function h′ can be defined as follows:

h′(x) =

{y if x = y‖yh(x) otherwise

i.e. h′ is the same as h, except that h′(yy) = y for any y.

The point being that whenever there is a hash chain (produced by theadversary) that contains h′-specific hash steps in the form h′(yy) = y,then we can simply delete such steps without breaking the correctness(changing the output value) of the hash chain.

Hence, the modification h 7→ h′ cannot be the cause of insecurity—if back-dating is easy with h′ then it is also easy with h.

47

Page 55: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

Lecture 7:Security Proofs for Time-Stamping

Secure time-stamping schemes require hash functions with security prop-erties that are not directly related to the three classical security goals forcryptographic hash functions: one-wayness, second pre-image resistance,and collision resistance. Hence, there are no guarantees that such func-tions are secure for time-stamping. In this lecture, we will see how to de-fine restrictions in time-stamping schemes so that the classical securityproperties of hash become sufficient for their used in the (restricted) time-stamping schemes. Based on the known research results in this area,we draw some conclusions about the security of practical time-stampingschemes.

Page 56: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

Collision Extraction From Hash Chains

The proof from Asiacrypt 2004 uses the following fact on hash functions:

If there are two hash chains of the same shape that have different inputsx′ and x and the same output y, then having those two hash chains, it iseasy to extract a collision for the hash function.

Page 57: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

Two Chains with Different Shape

Two chains from different shape do not necessarily produce a collision. Forexample, if they belong to one and the same tree:

Page 58: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

Outline of the Security Proof from Asiacrypt 2004

The collision-finder A′ executes the first stage of the adversary to producea published hash value r, and then executes the second stage twice inorder to have to successfully back-dated requests x and x′ with two hashchains of the same shape.

Page 59: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

Efficiency of the Collision Finder

If there are N allowed shapes for hash chains, and δ1,...,δN are the prob-abilities that A succeeds with a hash chain of the corresponding shape,then the overall success probability of A is δ = δ1 + ...+ δN .

As the values x and x′ are unpredictable, Pr[x = x′] is negligible. Hence,the success of the collision-finding adversary is about

δ′ ≈ δ21 + ...+ δ2

N = N ·δ2

1 + ...+ δ2N

N≥ N ·

(δ1 + ...+ δN

N

)2=δ2

N.

As the running time t′ of the collision finder A′ is about twice the runningtime of A, we have that the time-success ratio of A′ is:

t′

δ′≈ 2N ·

t

δ2,

which means that if we want to have a t/δ-secure time stamping scheme,we have to use a 2N · t

δ2-secure hash function.

Page 60: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

An Improved Reduction

A recently published security proof by Buldas and Niitsoo (2010) estab-lishes a security reduction in which (for having t/δ-security), the hash func-tion must be about 14 ·

√N · t

d1.5-secure, i.e.

t′

δ′≈ 14

√N ·

t

δ1.5,

which is much more efficient than in the previous reductions.

This is the first security reduction that implies the security of global scaletime-stamping schemes that use 256-bit hash functions.

Page 61: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

The collision-finding adversary executes the second stage of the collision-finding adversary m times (instead of two, as in the proof of Asiacrypt2004), with m =

√Nδ where δ is the success of A.

Page 62: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

Conclusions for Practical Schemes

What is N for practical schemes? It is not constant but grows over time.

If a new tree of size C is produced every second, then N = τ · C, whereτ is the running time of the adversary, measured in seconds.

In security proofs, however, the running time t of A is measured in compu-tational steps.

World’s total computational power :It has been estimated that the compu-tational power of the whole world is close to 262 operations per seconds(from 2007). We assume that this has been grown to about P = 264.

Page 63: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

It is natural to assume that t ≈ 264 · τ , and hence

N ≈C · tP

.

The efficiency formulae for the two reductions are as follows:

t′δ′ ≈ 2CP ·

(tδ

)2(2004),

t′δ′ ≈ 14

√CP ·

(tδ

)1.5(2010)

Page 64: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011

Conclusions for Practical Schemes

For having a tδ-secure time-stamping scheme, we have to use t′

δ′-securehash functions, which means that if they are secure (to the Birthday barrier)their output in bits should be

k ≈ 2 · log2t′

δ′.

By denoting c = log2C and s = log2tδ , we deduce from the efficiency

formulae that:

k ≈ 2(1 + c− 64 + 2s) = 2c+ 4s− 126 (2004)

k ≈ 2(4 + c2 − 32 + 1.5s) = c+ 3s− 56 (2010)

Page 65: Hash Functions: lecture series by Ahto Buldas

Lecture 7 on Hash Functions July 26, 2011