18
Int. J. Inf. Secur. (2012) 11:85–102 DOI 10.1007/s10207-012-0155-8 REGULAR CONTRIBUTION Short collusion-secure fingerprint codes against three pirates Koji Nuida Published online: 4 February 2012 © Springer-Verlag 2012 Abstract In this article, we propose a new construction of probabilistic collusion-secure fingerprint codes against up to three pirates and give a theoretical security evaluation. Our pirate tracing algorithm combines a scoring method analo- gous to Tardos codes (J ACM 55:1–24, 2008) with an exten- sion of parent search techniques of some preceding 2-secure codes. Numerical examples show that our code lengths are significantly shorter than (about 30–40% of) the shortest known c-secure codes by Nuida et al. (Des Codes Cryptogr 52:339–362, 2009) with c = 3. Keywords Collusion-secure code · Fingerprint code · 3-secure code · Content protection 1 Introduction 1.1 Background and related works Recently, digital content distribution services have been widespread by virtue of progress in information technol- ogy. Digitization of content distribution has improved con- venience for ordinary people. However, the digitization also enables malicious persons to perform more powerful attacks, and the amount of illegal content redistribution is increasing very rapidly. Hence technical countermeasures for such ille- gal activities are strongly desired. A use of fingerprint code is a possible solution for such problems, which aims at giving traceability of the attacker (pirate) when an illegally redis- K. Nuida (B ) Research Center for Information Security (RCIS), National Institute of Advanced Industrial Science and Technology (AIST), AIST Tsukuba Central 2, 1-1-1 Umezono, Tsukuba, Ibaraki 305-8568, Japan e-mail: [email protected] tributed digital content is found, thus letting the potential attackers abandon to perform actual attacks. In the context of fingerprint codes, each copy of a con- tent is divided into several segments (common to all copies), in each of which a bit of an encoded user ID is embedded by the content provider by using watermarking technique. The embedded encoded ID (fingerprint) provides traceabil- ity of an adversarial user (pirate) when an unauthorized copy of the content is distributed. Such a scheme aims at trac- ing some pirate, without falsely tracing any innocent user, from the fingerprint embedded in the pirated content with an overwhelming probability. It has been noticed that a coali- tion of pirates can perform certain strong attacks (collusion attacks) to the fingerprint, therefore any effective fingerprint code should be secure against collusion attacks, called collu- sion-secure codes. In particular, if the code is secure against collusion attacks by up to c pirates, then the code is called c-secure [5]. Several constructions of collusion-secure codes have been proposed so far. Among them, the one proposed by Tardos [24] is “asymptotically optimal”, in the sense that the order of his code length with respect to the allowable number c of pirates is theoretically the lowest (which is quadratic in c). For improvements of Tardos codes, the constant factor of the asymptotic code length has been reduced by c-secure codes given by Nuida et al. [15] to approximately 5.35% of Tardos codes, which is the smallest value so far prov- able without any additional assumption. There are also other improvements of Tardos codes whose lengths are longer than Nuida’s code [4, 21, 23] or whose security proofs depend on certain statistical assumptions on the behavior of codewords [21, 23]. Moreover, variants of Tardos codes over large alpha- bets (instead of binary ones as in [15, 24]) have been proposed as well [4, 21, 22]. Some recent works [1, 2, 8, 9] focused on the achievable rates (or “fingerprinting capacity”) of 123

Short collusion-secure fingerprint codes against three pirates

Embed Size (px)

Citation preview

Page 1: Short collusion-secure fingerprint codes against three pirates

Int. J. Inf. Secur. (2012) 11:85–102DOI 10.1007/s10207-012-0155-8

REGULAR CONTRIBUTION

Short collusion-secure fingerprint codes against three pirates

Koji Nuida

Published online: 4 February 2012© Springer-Verlag 2012

Abstract In this article, we propose a new construction ofprobabilistic collusion-secure fingerprint codes against up tothree pirates and give a theoretical security evaluation. Ourpirate tracing algorithm combines a scoring method analo-gous to Tardos codes (J ACM 55:1–24, 2008) with an exten-sion of parent search techniques of some preceding 2-securecodes. Numerical examples show that our code lengths aresignificantly shorter than (about 30–40% of) the shortestknown c-secure codes by Nuida et al. (Des Codes Cryptogr52:339–362, 2009) with c = 3.

Keywords Collusion-secure code · Fingerprint code ·3-secure code · Content protection

1 Introduction

1.1 Background and related works

Recently, digital content distribution services have beenwidespread by virtue of progress in information technol-ogy. Digitization of content distribution has improved con-venience for ordinary people. However, the digitization alsoenables malicious persons to perform more powerful attacks,and the amount of illegal content redistribution is increasingvery rapidly. Hence technical countermeasures for such ille-gal activities are strongly desired. A use of fingerprint code isa possible solution for such problems, which aims at givingtraceability of the attacker (pirate) when an illegally redis-

K. Nuida (B)Research Center for Information Security (RCIS), National Institute ofAdvanced Industrial Science and Technology (AIST), AIST TsukubaCentral 2, 1-1-1 Umezono, Tsukuba, Ibaraki 305-8568, Japane-mail: [email protected]

tributed digital content is found, thus letting the potentialattackers abandon to perform actual attacks.

In the context of fingerprint codes, each copy of a con-tent is divided into several segments (common to all copies),in each of which a bit of an encoded user ID is embeddedby the content provider by using watermarking technique.The embedded encoded ID (fingerprint) provides traceabil-ity of an adversarial user (pirate) when an unauthorized copyof the content is distributed. Such a scheme aims at trac-ing some pirate, without falsely tracing any innocent user,from the fingerprint embedded in the pirated content with anoverwhelming probability. It has been noticed that a coali-tion of pirates can perform certain strong attacks (collusionattacks) to the fingerprint, therefore any effective fingerprintcode should be secure against collusion attacks, called collu-sion-secure codes. In particular, if the code is secure againstcollusion attacks by up to c pirates, then the code is calledc-secure [5].

Several constructions of collusion-secure codes have beenproposed so far. Among them, the one proposed by Tardos[24] is “asymptotically optimal”, in the sense that the orderof his code length with respect to the allowable number cof pirates is theoretically the lowest (which is quadratic inc). For improvements of Tardos codes, the constant factorof the asymptotic code length has been reduced by c-securecodes given by Nuida et al. [15] to approximately 5.35%of Tardos codes, which is the smallest value so far prov-able without any additional assumption. There are also otherimprovements of Tardos codes whose lengths are longer thanNuida’s code [4,21,23] or whose security proofs depend oncertain statistical assumptions on the behavior of codewords[21,23]. Moreover, variants of Tardos codes over large alpha-bets (instead of binary ones as in [15,24]) have been proposedas well [4,21,22]. Some recent works [1,2,8,9] focusedon the achievable rates (or “fingerprinting capacity”) of

123

Page 2: Short collusion-secure fingerprint codes against three pirates

86 K. Nuida

collusion-secure codes in asymptotic cases, rather than con-crete bounds of error probabilities in practical cases.

On the other hand, after the first proposal of Tardos codes,there were proposed several collusion-secure codes [3,6,11,14,16] that restrict the number of pirates to c = 2 but achievefurther short code lengths. Such constructions of short c-secure codes for a small c would have not only theoreticalbut also practical importance; for example, when the usersare less anonymous for the content provider (e.g., the case ofsecret documents distributed in a company), it seems infea-sible to make a large coalition confidentially. The aim of thisarticle is to extend such a “compact” construction to the nextcase c = 3.

For related works, we notice that there is an earlier workby Schaathun [17,18] on construction of 3-secure codes (theproposal of 3-secure codes by Sebé and Domingo-Ferrer [20]was earlier than Schaathun’s one, but it was shown later tobe insecure by Schaathun [19]). On the other hand, there isanother work by Kitagawa et al. [10] on construction of 3-secure codes, in which very short code lengths are proposedbut its security is evaluated only by computer experimentsfor some special attack strategies.

1.2 Our contribution

In this article, we propose a new construction of 3-securecodes and give a theoretical security evaluation. The code-word generation algorithm is just a bit-wise random sam-pling, which has been used by many preceding constructionsas well. The novel point of our construction is in the piratetracing algorithm, which combines the use of score computa-tion analogous to Tardos codes [24] with an extension of “par-ent search” technique of some preceding works against twopirates [3,11,16]. Intuitively, the score computation methodworks well when the parts of fingerprint in the pirated contentare not chosen evenly from the codewords of pirates, whilethe extended “parent search” technique works well when thefingerprint is evenly chosen from the codewords of pirates,therefore their combination is effective.

In comparison under some parameter choices, our codelengths are approximately 1–1.5% of 3-secure codes bySchaathun [17,18], and approximately 30–40% of c-securecodes by Nuida et al. [15] for c = 3. This shows that our codelength is even significantly shorter than the shortest knownc-secure codes [15]. See Sect. 3.5 for details of comparisonwith preceding results.

In fact, Kitagawa et al. [10] claimed that their 3-securecode provides almost the same security level as our code forthe case of 100 users and 128-bit length. However, they eval-uated the security by only computer experiments for the caseof some special attack algorithms (and they studied just oneparameter choice as above), while in this article we give atheoretical security evaluation for arbitrary attack algorithms

under the standard Marking Assumption (cf., [5]). (One maythink that the perfect protection of so-called undetectablepositions required by Marking Assumption is not practical.However, this is in fact not a serious problem, as a generalconversion technique recently proposed by Nuida [12] cansupply robustness against erasure of a bounded number ofundetectable bits.)

1.3 Notations

In this article, log denotes the natural logarithm. We put[n] = {1, 2, . . . , n} for an integer n. Unless some ambi-guity emerges, we often abbreviate a set {i1, i2, . . . , ik} toi1i2 · · · ik , and when we use this abbreviation, we implicitlyassume that i1, i2, . . . , ik are all different. Let δa,b denoteKronecker delta, that is, we have δa,b = 1 if a = b andδa,b = 0 if a �= b. For a family F of sets, let

⋃ F and⋂ F denote the union and the intersection, respectively, ofall members of F .

1.4 Organization of the article

In Sect. 2, we give a formal definition of the notion of col-lusion-secure fingerprint codes. In Sect. 3, we describe ourcodeword generation algorithm and pirate tracing algorithm,state the main results on the security of our 3-secure codes,and give some numerical examples for comparison to preced-ing works. Section 4 summarizes the outline of the securityproof. Finally, Sect. 5 supplies the details of proofs omittedin Sects. 3 and 4.

2 Collusion-secure fingerprint codes

In this section, we introduce formal definitions for fingerprintcodes. Let N and m be positive integers, and 1 ≤ c ≤ N aninteger parameter. Put U = [N ]. Fix a symbol “?” differentfrom “0” and “1”. We start with the following definition:

Definition 1 Given the parameters N , m and c, we define thefollowing game, which we refer to as pirate tracing game.The players of the game is a provider and pirates, and thegame is proceeded as follows:

1. Provider generates an N × m binary matrix W =(wi, j )i∈[N ], j∈[m] and an element st called state infor-mation.

2. Pirates generate UP ⊆ U, 1 ≤ |UP| ≤ c, without know-ing W and st.

3. Pirates receive the codeword wi = (wi,1, . . . , wi,m) forevery i ∈ UP.

4. Pirates generate a word y = (y1, . . . , ym) on {0, 1, ?}under a certain restriction specified below, and send y toprovider.

123

Page 3: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 87

5. Provider generates Acc ⊆ U from y, W , and st, withoutknowing UP.

6. Then pirates win if Acc ∩ UP = ∅ or Acc �⊆ UP, andotherwise provider wins.

We call the word y in Step 4 an attack word and call “?” anerasure symbol. Put UI = U\UP. In the definition, U signi-fies the set of all users, UP is the coalition of pirates, and UI

is the set of innocent users. The codeword wi signifies thefingerprint for user i , and the word y signifies the fingerprintembedded in the pirated content. The set Acc consists of theusers traced by the provider from the pirated content. Theevents Acc ∩ UP = ∅ and Acc �⊆ UP specified in Step 6are referred to as false-negative and false-positive (or false-alarm), respectively. Both of false-negative and false-positiveare called tracing error.

Let Gen, Reg, ρ, and Tr denote the algorithms used inSteps 1, 2, 4, and 5, respectively. We call Gen, Reg, ρ, andTr codeword generation algorithm, registration algorithm,pirate strategy, and tracing algorithm, respectively. We referto the pair C = (Gen, Tr) as a fingerprint code, and the fol-lowing quantity

Pr [(W, st)← Gen(); UP ← Reg();y← ρ(UP, (wi )i∈UP);Acc← Tr(y, W, st) : Acc ∩UP = ∅ or Acc �⊆ UP] (1)

(i.e., the overall probability that pirates win) is called an errorprobability of C.

In order to specify the restriction for y mentioned in Step 4,first we present some definitions. For binary words w(1), . . . ,

w(k) of length m, we define

E(w(1), . . . , w(k)) ={

y ∈ {0, 1}m∣∣∣∣y j ∈

{w

(1)j , . . . , w

(k)j

}for every j ∈ [m]

}

(2)

and call it the envelope of w(1), . . . , w(k). On the other hand,we define

E(w(1), . . . , w(k))

={

y ∈ {0, 1, ?}m | if w(1)j = w

(2)j = · · · = w

(k)j

then y j = w(1)j

}(3)

and call it the extended envelope of w(1), . . . , w(k). Notethat E(w(1), . . . , w(k)) ∩ {0, 1}m = E(w(1), . . . , w(k)). Inthis article, we put the following standard assumption calledMarking Assumption [5]:

Definition 2 The Marking Assumption states the following:If UP = {i1, i2, . . . , ik}, then the attack word y belongs to the

extended envelope E(wi1, wi2 , . . . , wik ) of codewords for thepirates.

For j ∈ [m], j-th column in codewords is called unde-tectable if j-th bits wi, j of the codewords wi for pirates i ∈UP coincide with each other; otherwise the column is calleddetectable. Then the Marking Assumption can be rephrasedas follows: For the attack word y, for every undetectable col-umn j , we have y j = wi, j for some (or equivalently, all)i ∈ UP (while y j may be freely chosen for detectable col-umns j).

We say that a fingerprint code C is collusion-secure if theerror probability of C is sufficiently small for any Reg andρ under Marking Assumption. More precisely, we say thatC is c-secure (with ε-error) [5] if the error probability isnot higher than a sufficiently small value ε under MarkingAssumption.

3 Our 3-secure codes

Here, we propose a codeword generation algorithm Gen anda tracing algorithm Tr for 3-secure codes (c = 3). The secu-rity property will be discussed below.

3.1 Overall structure

The algorithm Gen, with parameter 1/2 ≤ p < 1, is thecodeword generation algorithm of Tardos codes [24], but theprobability distribution of biases is different: For each (say,j-th) column, each user’s bit wi, j is independently chosenby Pr [wi, j = 1] = p j , where p j = p or 1 − p with prob-ability 1/2 each. Then Gen outputs W = (wi, j )i∈[N ], j∈[m]and st = (p j ) j∈[m].

To describe the algorithm Tr, we introduce some notation.For a binary word y of length m and a collection W = (wi, j )

of codewords of users, we define

Par(y) = {i1i2i3 ⊆ U | y ∈ E(wi1 , wi2 , wi3)} (4)

(see Sect. 1.3 for the notation i1i2i3). A key property impliedby Marking Assumption is that if the attack word y containsno erasure symbols, then y belongs to the envelope of thecodewords of pirates and, if furthermore |UP| = 3, the fam-ily Par(y) contains the set of three pirates (here, “Par” standsfor “Parents”, which is motivated from the property that ifi1i2i3 ∈ Par(y), then y can be generated from three wordswi1, wi2 and wi3 according to Marking Assumption).

The overall structure of our tracing algorithm Tr, withwords y, w1, . . . , wN and state information st = (p j ) j∈[m]as input, is the following, where Tr1 and Tr2 are subroutinesdefined later:

1. Replace each erasure symbol “?” in y with “0” or “1”independently in the following manner. If y j = ?, then

123

Page 4: Short collusion-secure fingerprint codes against three pirates

88 K. Nuida

it is replaced with “1” with probability p j , and with “0”with probability 1− p j . Let y′ denote the resulting word.

2. Execute the subroutine Tr1, with y′, w1, . . . , wN andst = (p j ) j∈[m] as input. If the output of Tr1 is empty, thengo to the next step. Otherwise, output the (non-empty)output of Tr1 and halt.

3. Execute the subroutine Tr2, with y′ and w1, . . . , wN asinput. Then output the (possibly empty) output of Tr2

and halt.

Roughly speaking, the first subroutine Tr1 plays a roleof coarse filter, which defies “unbalanced” pirate strategies.Namely, if the contribution of a pirate’s codeword to gener-ate the attack word y is much larger than other pirate’s code-word, then such a pirate will be correctly accused by Tr1 withoverwhelming probability; while any innocent user will beunlikely to be falsely accused here. Then, after the coarsefiltering, the second subroutine Tr2 performs more refinedtracing, which is the main part of our tracing algorithm. Infact, this subroutine will fail when the pirate strategy is toounbalanced in the above sense; this is the reason why theauxiliary tracing process Tr1 is introduced before the mainpart Tr2.

We notice that Tr1 is a kind of so-called single decoders,while Tr2 is a kind of so-called joint decoders. For the casep = 1/2 of our codeword generation algorithm, it is knownthat the “minority vote” by three pirates for generating theattack word y cancels the mutual information between y anda single codeword, therefore the pirates are likely to escapefrom the single decoder Tr1. However, even by such a strat-egy the pirates are unlikely to escape from the joint decoderTr2, as collections of users rather than individual users areconsidered there.

In the above algorithm, an arbitrary attack word y, whichmay contain erasure symbols, is mapped in Step 1 to anotherword y′ with no erasure symbols, for the purpose of sim-plifying the subsequent tracing process. As the informationon the erasure symbols in y is missed during this replace-ment, one may think that the tracing performance can beimproved by making use of the missing information on theerasure symbols. However, note that the modified binaryattack word y′ generated in Step 1 of the tracing algorithmcan be generated as well by some pirate strategy not usingerasure symbols. This implies that the highest error proba-bility of the present tracing algorithm can be achieved bya pirate strategy not using erasure symbols; therefore, evenif we can improve the performance of the tracing algorithmagainst pirate strategies using erasure symbols, it does notimprove the performance against the best pirate strategy(which will not use erasure symbols). Hence, from the view-point of simplicity, we prefer to design the tracing algorithmby ignoring information on the erasure symbols, without any

decrease in the tracing performance against the best piratestrategy.

3.2 The first subroutine

We describe the first subroutine Tr1, with binary wordsy′, w1, . . . , wN and state information st = (p j ) j∈[m] asinput, as follows:

1. Calculate a threshold parameter Z = Z y′ as specifiedbelow.

2. For each i ∈ U , calculate the score S(i) of i by

S(i) =∑

j∈[m]y′j=1

δwi, j ,y′j log1

p j

+∑

j∈[m]y′j=0

δwi, j ,y′j log1

1− p j. (5)

3. For each i ∈ U do the following: Check if S(i) ≥ Z ,and if it is satisfied, then output i .

Roughly speaking, if some pirates’ codewords contributeto generate the attack word at too many columns than theother pirates’ codewords, then it is very likely that scoresof such pirates exceed the threshold and they are correctlyaccused by Step 3. Hence this process defies “unbalanced”pirate strategies in the above sense.

Note that our scoring function (5) is different from theones for Tardos codes [24] and its symmetrized version [21]that have been shown to provide significantly good perfor-mance. The choice of our scoring function is just due to atechnical reason; in our tracing algorithm Tr the second sub-routine Tr2 is executed only when the scores of all usersare lower than the threshold, and the shape of our scoringfunction makes it easier to theoretically evaluate the errorprobability of Tr2 conditioned on such lower scores. Thereis a possibility that the true error probability of our tracingalgorithm Tr is reduced by applying the conventional scor-ing functions and another novel evaluation technique enablesus to prove a (better) bound of the reduced error probabil-ity. Improvements of our result in this way will be a futureresearch topic.

The threshold parameter Z = Z y′ in Step 1 is determinedas follows. Let AH be the set of column indices j such that(p j , y′j ) = (p, 1) or (1− p, 0), that is, the occurrence prob-ability of the bit y′j ∈ {0, 1} at j-th column is p ≥ 1/2, andlet AL = [m] \ AH . Put aH = |AH | and aL = |AL |. Choosea parameter ε0 > 0, which is smaller than the desired boundε of error probability. Then choose Z = Z y′ satisfying the

123

Page 5: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 89

following condition:

kH ,kL

(aL

kL

)

paL−kL (1− p)kL

(aH

kH

)

pkH (1− p)aH−kH ≤ ε0

N,

(6)

where the sum runs over all integers kH , kL ≥ 0 such thatkH log 1

p + kL log 11−p ≥ Z . An example of a concrete

choice of Z satisfying the condition (6) is as follows:

Z0 = aH p log1

p+aL(1− p) log

1

1− p

+√√√√ 1

2

((

log1

p

)2

aH+(

log1

1− p

)2

aL

)

logN

ε0

(7)

(see Sect. 5.1 for the proof). From now, we suppose that thethreshold Z satisfies the condition (6) and Z ≤ Z0.

Note that when p = 1/2, the score S(i) of a user i isequal to log 2 times the number of columns in which thewords wi and y′ coincide. Hence the calculation of scorescan be made easier by using the “normalized” score S(i) =S(i)/ log 2 instead, which is equal to m minus the Hammingdistance ofwi from y′, together with the “normalized” thresh-old Z0/ log 2 = m/2+√

(m/2) log(N/ε0).

3.3 The second subroutine

We describe the second subroutine Tr2, with binary words y′and w1, . . . , wN as input, as follows:

1. Calculate the set Par′ = {T ∈ Par(y′) | T ∩ T ′ �=∅ for every T ′ ∈ Par(y′)}. If Par′ = ∅, then halt (withempty output).

2. If⋂

Par′ �= ∅, then output every member of⋂

Par′ andhalt.

3. Calculate the set P = {i1i2 ⊆ U | i1i2 ∩ T �=∅ for every T ∈ Par′}. Let Pk be the set of all i ∈ Usuch that |{P ∈ P | i ∈ P}| = k.

4. If P1 �= ∅, then do the following and halt (otherwise goto the next step): For each i ∈⋃ P , check if i i ′ ∈ P forsome i ′ ∈ P1, and if it is satisfied, then output i .

5. If |P| = 7, then do the following and halt (otherwise goto the next step): For each i ∈⋃ P , check if i i ′ ∈ P forsome i ′ ∈ P2, and if it is satisfied, then output i .

6. If |P| = 6, then output every i ∈ P3 and halt.7. If |P| = 5 and Par′′ �= ∅, where Par′′ is defined by

Par′′ = {i1i2i3 ∈ Par′ | i1i2, i2i3, i1i3 ∈ P} , (8)

then output every member of P2 ∩ (⋃

Par′′) and halt.8. If |P| = 5 and Par′′ = ∅, where Par′′ is defined by

(8), then do the following and halt (otherwise go to the

next step): For each i ∈⋃ P , check if i i ′ �∈ P for somei ′ ∈⋃ P , and if it is satisfied, then output i .

9. If |P| = 4, then do the following and halt (otherwise goto the next step): For each i ∈ ⋃ P , check if i ∈ T forevery T ∈ Par′ such that T ⊆⋃ P , and if it is satisfied,then output i .

10. If |P| = 3, then output every i ∈⋃ P and halt.11. Output nobody and halt.

In this and the next paragraphs, we give some explana-tion of the construction of the algorithm. The first insightis that, given a “binarized” attack word y′ under MarkingAssumption, the triple of pirates (when there are preciselythree pirates) belongs to the set Par(y′), while any triple ofinnocent users is very unlikely to be a member of Par(y′) (asshown in later sections). Therefore, the triple of pirates willbe a member of the set Par′ with overwhelming probability,which gives a condition to significantly refine the candidatesof pirates. This process is represented as Steps 1 and 2.

When the algorithm did not halt until Step 2, next weintend to determine a pirate by investigating the detailed“structure” of the set Par′. Roughly speaking, it will be shownin later sections that Par′ is very unlikely to involve certain“symmetric” substructures (see “Type III error” and “TypeIV error” in Sect. 4), and the resulting asymmetry of Par′gives us a key to determine a pirate. Such possible asym-metric structures of Par′ can even be classified into a fewdozens of patterns in a certain manner, but it is space-con-suming to enumerate them and specify an appropriate outputfor each pattern in a case-by-case manner. (Some examplesof the possibilities of Par′ are given in Fig. 1, where 1, 2, 3are the pirates, i j are innocent users and the members of Par′are denoted by triangles.) Moreover, it also takes non-negli-gible computing cost to determine, from given data of Par′,to which of the enumerated patterns the present Par′ belongs.Instead, we prefer to give a somewhat artificial but explicitalgorithm (Steps 3–11) to determine a suitable output directlyfrom the set Par′, which is much less space-consuming thanthe complete enumeration.

Here, we give a brief discussion on computational com-plexity of the algorithm Tr2, in particular, of calculation ofthe set Par′ from Par(y′). We consider the case p = 1/2 forsimplicity. The computational complexity depends mainlyon the size of the set Par(y′), which is of order �(N 3) in theworst case. Nevertheless, it can be significantly reduced inaverage case. First, as we have mentioned above, it holds withoverwhelming probability that every member of Par(y′) hasnon-empty intersection with the set of the three pirates. Thenumber of members of Par(y′) that consists of two piratesand one innocent user is at most 3(N−3). On the other hand,for each pirate i ∈ UP, we may assume that the words wi

and y′ differ at m − Z0/ log 2 = m/2−√(m/2) log(N/ε0)

or more columns, as otherwise the tracing algorithm Tr

123

Page 6: Short collusion-secure fingerprint codes against three pirates

90 K. Nuida

Fig. 1 Examples of the sets Par′ and P

should have halted at the first part Tr1 by outputting i . Asthe codewords of innocent users are chosen independentlyand uniformly at random from all m-bit binary words, it fol-lows that for two distinct innocent users i1 and i2, we havei i1i2 ∈ Par(y′) with probability at most (3/4)m−Z0/ log 2,therefore the expected number of members of Par(y′) thatconsist of one pirate and two innocent users is at most3(N−3

2

)(3/4)m−Z0/ log 2. This number is not too large in prac-

tical cases, as m should not be too small in order to keep thesecurity [see (12) below]. This argument suggests that thecalculation of Par′ from Par(y′) is less dominant than cal-culation of Par(y′) itself, the latter seeming to require �(N 3)

computational complexity in a naive manner.

3.4 The security

For the security of the proposed fingerprint code, first wepresent the following result, which will be proven in Sect. 4:

Theorem 1 By the above choice of ε0 and Z, if the numberof pirates is three, then the error probability of the proposedfingerprint code is lower than

ε0 +(

N − 3

3

)

f1(p)m + 3(N − 3)(N − 4) f2(p)m

+(N − 3)(1− p)−3√

(m/2) log(N/ε0) f3(p)m , (9)

where we put

f1(p) = 1− 3p2 + 10p3 − 15p4 + 12p5 − 4p6 ,

f2(p) = p2(1− p)2(√

p +√1− p)

+ 1− p − p2 + 4p3 − 2p4 ,

f3(p) = p4−3p(p2 − 3p + 3)+ (1− p)3p+1

× (p2 + p + 1).

(10)

Some numerical analysis suggests that the choice p = 1/2would be optimal (or at least pretty good) to decrease thebound of error probabilities specified in Theorem 1. In fact, anelementary analysis shows that the second term

(N−33

)f1(p)m

in the sum, which seems dominant (cf., Theorem 2 below),takes the minimum over p ∈ [1/2, 1) at p = 1/2. Hence,we use p = 1/2 in the following argument. Now, it is shownthat the error probability against less than three pirates alsohas the same bound under a condition (12) below (whichseems trivial in practical situations), therefore we have thefollowing (which will be proven in Sect. 4):

Theorem 2 By using the value p = 1/2, the proposed fin-gerprint code is 3-secure with error probability lower than

ε0 +(

N − 3

3

) (7

8

)m

+ 3(N−3)(N−4)

(10+√2

16

)m

+ (N − 3)8√

(m/2) log(N/ε0)

(7√

2

16

)m

(11)

provided

m ≥ 8 logN

ε0

(

1+ 1

16 log(N/ε0)

)2

. (12)

Here, we give a remark on a relation between false-neg-ative and false-positive in the tracing algorithm. It will beshown in Sect. 4 and later sections that, in the bounds (9)and (11) of error probabilities, the first term ε0 correspondsto false-positive, while the second term corresponds to false-negative (the other terms are relevant to both of false-positiveand false-negative). As the second term would be dominant(unless ε0 is set to be too large), the probability of false-neg-ative would be dominant among the whole error probability.This means that if some application requires the probabilityof false-positive to be very low while it allows the probabil-ity of false-negative to be relatively high, it is expected thatthe code length can be reduced further in such a situation.This relation between false-negative and false-positive seemspractically desirable, as usually false-positive is regarded asbeing more crucial than false-negative.

3.5 Comparison of code lengths with other codes

Table 1 shows comparison of our code lengths (numeri-cally calculated by using Theorem 2) with 3-secure codes

123

Page 7: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 91

Table 1 Comparison of code lengths with the codes by Schaathun [18]

N 213 215 218 221 240

ε 10−25 10−75 10−16 10−53 10−148

[18] 57,337 229,369 57,330 114,681 458,745

Ours 890 2,430 662 1,815 4,880

Ratio (%) 1.55 1.06 1.15 1.58 1.06

Table 2 Comparison of code lengths with the codes by Nuida et al.[15] (c = 3)

N 300 109 106

ε 10−11 10−6 10−3

[15] 1,309 1,423 877

Ours 420 561 354

Ratio (%) 32.1 39.4 40.4

by Schaathun [18]. Table 2 shows the comparison with c-secure codes by Nuida et al. [15] for c = 3. The values of Nand ε and the corresponding code lengths are chosen fromthose articles. For the calculation of our code length, here wechoose the parameter ε0 = N exp(−α0m), where

α0 = log16

7√

2+ (log 8)2

4− log 8

4

8 log16

7√

2+ (log 8)2

≈ 0.075658 . . . (13)

This choice of ε0 is aimed at making the first and the fourthterms in (11) almost equal, which can reduce the computa-tional time of this experiment. Although this choice of ε0 ismotivated by efficient calculation and may be not the bestfrom the viewpoint of reducing the code lengths, the tablesshow that our code lengths derived by using these ε0 are stillmuch shorter than the codes in [18], and even significantlyshorter than the codes in [15]. On the other hand, recently Ki-tagawa et al. [10] proposed another construction of 3-securecodes and evaluated the security against some typical piratestrategies in the case N = 100 and m = 128 by computerexperiment. The resulting error probability was ε = 0.009.For the same error probability, our code length (with theabove parameter ε0) is m = 139. Therefore, our code, whichis provably secure in contrast to their code, has almost thesame length as their code.

We also give graphs of code lengths of the proposed 3-secure codes for various error probabilities (Fig. 2) and forvarious user numbers (Fig. 3). The graphs suggest that thebehavior of code lengths is almost logarithmic in user num-bers and in error probabilities.

Note that the other 3-secure codes over binary alphabetwhich are referred to in this article are either already hav-ing much longer code lengths than the codes in [15] ([5,24]),or mainly studied from asymptotic viewpoints and given nei-ther self-contained formula of error probability nor numerical

150

200

250

300

350

400

450

500

550

600

10--610--510--410--310--2

code

leng

th

error probability

N = 103 N = 106 N = 109

Fig. 2 Code lengths of the proposed 3-secure codes for various errorprobabilities

200

250

300

350

400

450

500

550

102 103 104 105 106

code

leng

th

number of users

error prob. = 10-5

error prob. = 10-8

error prob. = 10-11

Fig. 3 Code lengths of the proposed 3-secure codes for various usernumbers

examples for practical cases [1,2,4,8,9,21,23]. The compar-isons with the former codes are omitted, as it has been shownabove that our code length is significantly shorter than [15].In order to perform comparisons with the latter codes, herewe show some asymptotic properties of our codes. First, forthe rate of our proposed codes, we have the following result,which will be proven in Sect. 5.8:

Theorem 3 We use the value p = 1/2. Then, for any positivenumber R such that

R < R0 = 1− log 7

3 log 2≈ 0.064215 · · · (14)

the error probability of our proposed 3-secure code with codelength m and user number N = 2m R� approaches to 0exponentially in m, where the parameter ε0 is set to be ε0 =exp(−αm) with positive constant α satisfying

α <2

9(log 2)2

(

R0 log 2+ log

(7√

2

16

))2

− R0 log 2

≈ 0.043250 · · · . (15)

123

Page 8: Short collusion-secure fingerprint codes against three pirates

92 K. Nuida

By the theorem, the achievable rate of our 3-secure code isat least R0. Regarding preceding results on the achievablerates of 3-secure codes, the rate of c-secure codes by Barg,Blakley and Kabatiansky [2] in the case c = 3, which isshown in Corollary 5.2 of [2], is at most R(s)

c /c3 = R(s)c /27,

where R(s)c denotes (as in [2]) the maximum achievable rate

of (c, c)-separating codes. In particular, this rate cannot behigher than 1/27 ≈ 0.037037 . . ., therefore the rate of ourcode is significantly higher than that in [2]. On the otherhand, the c-secure codes by Amiri and Tardos [1] have, inthe case c = 3, achievable rate 0.0975 (as shown in Table 1of [1]), which is much higher than our codes. However, theirtracing algorithm would require much more computationaltime than our algorithm and therefore be not practical, asit seems that their algorithm needs to solve certain convexoptimization problem associated with each triple of users.Moreover, it is claimed in Fig. 2(a) of [9] by Huang andMoulin (which is a full version of [8]) that their c-securecodes in the case c = 3 have achievable rate higher than (or atleast the same as) our rate. However, their argument restrictsthe pirate strategies to some “symmetric” ones, therefore it isnot yet guaranteed that their codes have sufficiently low errorprobabilities in practical cases against arbitrary pirate strate-gies under Marking Assumption. It is a future research topicto construct 3-secure codes that are provably secure withconcrete bounds of error probabilities, have efficient tracingalgorithms, and have achievable rates close to the latter tworesults.

Secondly, for the asymptotic behavior of our code lengths,we have the following result, which will be proven inSect. 5.9:

Theorem 4 We use the value p = 1/2, and put ε0 =N exp(−3m/40). If 0 < ε ≤ 1/3, then the code length m ofour 3-secure code with error probability ε is shorter than orequal to m = 9K log(N/ε)�, where

K = 1

9R0 log 2≈ 2.4963 . . . (16)

[see (14) for the definition of R0].

This result shows that the ratio m/(c2 log(N/ε)), which hasbeen used in the literature to compare the code lengths inasymptotic cases, is approximately 2.50 for our 3-securecodes. This value is significantly smaller than the theoret-ically proven asymptotic ratios in [21] (i.e., π2 ≈ 9.87) andin [23] (i.e., 2π2 ≈ 19.7), and the ratio shown in Fig. 1 of [4](i.e., over 25 for c = 3). Our ratio is also significantly smallerthan the asymptotic ratioπ2/2 ≈ 4.93 in [21] under an exper-imentally verified assumption on approximation based onCentral Limit Theorem. This suggests that our code lengthis much shorter than the 3-secure codes in [4,21,23].

4 Security proof

In this section, we present an outline of the proof of Theo-rems 1 and 2. Omitted details of the proof will be suppliedin Sect. 5.

First, we present some properties of the threshold param-eter Z = Z y′ , which will be proven in Sect. 5.1:

Proposition 1 1. If Z satisfies the condition (6), then theconditional probability that S(I) ≥ Z for some I ∈ UI,conditioned on the choice of y′, is not higher than (N −1)ε0/N.

2. The value Z = Z0 in (7) satisfies the condition (6).

To prove Theorem 1, we consider the case that the num-ber of pirates |UP| is three. By symmetry, we may assumethat UP = {1, 2, 3}. Put TP = 123, therefore we have TP ∈Par(y′) by Marking Assumption. Now, we consider the fol-lowing four kinds of events:

Type I error: S(I) ≥ Z for some innocent user I ∈ UI.Type II error: T ∩ TP = ∅ for some T ∈ Par(y′).Type III error: There are T1, T2 ∈ Par(y′) such that ∅ �=T1 ∩ T2 ⊆ UI, |T1 ∩ TP| = 1 and |T2 ∩ TP| = 1.Type IV error: S(i) < Z for every i ∈ {1, 2, 3}, and there isan innocent user I such that 12I ∈ Par(y′), 13I ∈ Par(y′)and 23I ∈ Par(y′).

Then, we have the following property, which will be provenin Sect. 5.2:

Proposition 2 If |UP| = 3, then tracing error occurs onlywhen one of the Type I, II, III and IV errors occurs.

By this proposition, the error probability is bounded bythe sum of the probabilities of Type I–IV errors. By Proposi-tion 1, the probability of Type I error is bounded by ε0. Now,Theorem 1 is proven by combining this with the followingthree propositions, which will be proven in Sects. 5.3, 5.4and 5.5, respectively [see (10) for the notations]:

Proposition 3 If |UP| = 3, then the probability of Type IIerror is not higher than

(N−33

)f1(p)m.

Proposition 4 If |UP| = 3, then the probability of Type IIIerror is not higher than 3(N − 3)(N − 4) f2(p)m.

Proposition 5 If |UP| = 3 and the threshold Z is chosen sothat the condition (6) holds and Z ≤ Z0, then the probabilityof Type IV error is lower than

(N − 3)(1− p)−3√

(m/2) log(N/ε0) f3(p)m . (17)

Note that Type I error mainly corresponds to false-positive, while Type II error causes false-negative only. TypeIII and Type IV errors are relevant to both false-positive and

123

Page 9: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 93

false-negative. As the probability of Type II error seems dom-inant among the four kinds of errors (unless ε0 is too large), itfollows that the whole error probability would mainly consistof that of false-negative.

To prove Theorem 2, we set p = 1/2. Then the boundof error probability given by Theorem 1 is specialized to thevalue specified in Theorem 2. Hence our remaining task is toevaluate the error probabilities for the case that the numberof pirates is one or two.

First, we consider the case that there are exactly twopirates, say, 1, 2 ∈ U . The key property is the following,which will be proven in Sect. 5.6:

Proposition 6 In this situation, if the condition (12) is sat-isfied, then the probability that S(1) < Z and S(2) < Z islower than ε0/N.

By this proposition, when the condition (12) is satisfied,at least one of the two pirates is output in Step 3 of the tracingalgorithm with probability not lower than 1− ε0/N . On theother hand, by Proposition 1, some innocent user is output inStep 3 with probability not higher than (N −1)ε0/N . Hencein Step 3, at least one pirate and no innocent users are outputwith probability not lower than 1− ε0. This implies that theerror probability is bounded by ε0 in this case.

Secondly, we consider the case that there is exactly onepirate, say, 1 ∈ U . Then, we have the following property,which will be proven in Sect. 5.7:

Proposition 7 In this situation, if m ≥ 2 log(N/ε0), thenthe score S(1) of the pirate is always higher than or equalto Z.

By this proposition, when the condition (12) is satisfied,the pirate is always output in Step 3 of the tracing algorithm.Hence by the same argument as the previous paragraph, theerror probability is bounded by ε0 in this case as well. Sum-marizing, the proof of Theorem 2 is concluded.

5 Proofs of the propositions

5.1 Proof of Proposition 1

First, we prove the claim 1 of Proposition 1. For each I ∈ UI

and σ ∈ {H, L}, let Kσ = { j ∈ Aσ | wI, j = y′j }. Then,we have S(I) = |K H | log(1/p)+ |KL | log(1/(1− p)). Nownote that the choice of y′ is independent of wI. This impliesthat we have Pr [wI, j = y′j | y′] = p for each j ∈ AH , andwe have Pr [wI, j = y′j | y′] = 1− p for each j ∈ AL . Hencethe conditional probability that |K H | = kH and |KL | = kL ,conditioned on this y′, is

(aLkL

)(1− p)kL paL−kL

(aHkH

)pkH (1−

p)aH−kH . This implies that Pr [S(I) ≥ Z | y′] is equal tothe left-hand side of (6), therefore the claim 1 holds as thereexist at most N − 1 innocent users I.

Secondly, to prove the claim 2 of Proposition 1, we usethe following Hoeffding’s Inequality:

Theorem 5 ([7], Theorem 2) Let X1, X2, . . . , Xn be inde-pendent random variables such that ai ≤ Xi ≤ bi for eachi . Let X be the average value of X1, . . . , Xn. Then for t > 0,we have

Pr [X − E[X ] ≥ t] ≤ exp

( −2n2t2∑n

i=1(bi − ai )2

)

. (18)

As mentioned above, the left-hand side of (6) is equal toPr [S(I) ≥ Z | y′], where I is any specified innocent user.Now for each j ∈ [m], let X j be a random variable such that{

Pr [X j = log(1/p)] = p

Pr [X j = 0] = 1− pif j ∈ AH ,

{Pr [X j = log(1/(1− p))] = 1− p

Pr [X j = 0] = pif j ∈ AL .

(19)

Then, conditioned on this y′, the variables X1, . . . , Xm areindependent and S(I) = m X . Now by a direct calculation,we have E[S(I) | y′] = m E[X | y′] = μ where μ =aH p log(1/p) + aL(1 − p) log(1/(1 − p)). Moreover, wehave 0 ≤ X j ≤ log(1/p) if j ∈ AH , and we have 0 ≤ X j ≤log(1/(1− p)) if j ∈ AL . Hence Theorem 5 implies that

Pr [S(I)− μ ≥ mt | y′]≤ exp

( −2m2t2

aH (log(1/p))2 + aL(log(1/(1− p)))2

)

(20)

for t > 0. Now by setting t = η/m where

η =√√√√1

2

((

log1

p

)2

aH +(

log1

1− p

)2

aL

)

logN

ε0,

(21)

the right-hand side of (20) is equal to ε0/N . On the otherhand, for the left-hand side of (20), we have

Pr [S(I)− μ ≥ mt | y′] = Pr [S(I) ≥ μ+ η | y′] , (22)

while the value of Z = Z0 in (7) is equal to μ + η. Hencethe condition (6) is satisfied, concluding the proof of Propo-sition 1.

5.2 Proof of Proposition 2

To prove Proposition 2, suppose that it is not the case of TypeI–IV errors. We show that tracing error does not occur in thiscase. Recall that TP = 123 ∈ Par(y′). By the absence ofType I error, it holds that either some pirate and no inno-cent users are output in Step 3 of Tr, or S(i) < Z for everyi ∈ U and nobody is output in Step 3. It suffices to con-sider the latter case. We have TP ∈ Par′ by the absenceof Type II error. Hence every T ∈ Par′ intersects TP, and

123

Page 10: Short collusion-secure fingerprint codes against three pirates

94 K. Nuida

⋂Par′ ⊆ TP. By virtue of Step 2, it suffices to consider

the case that⋂

Par′ = ∅. Now, there are the following twocases: (A) we have |T ∩ TP| = 1 for some T ∈ Par′; (B) wehave |T ∩ TP| = 2 for every T ∈ Par′ \ {TP}.

5.2.1 Case (A)

Let T1 ∈ Par′ and |T1 ∩ TP| = 1. By symmetry, we mayassume that T1 ∩ TP = {1}. By the fact

⋂Par′ = ∅, there

is a T2 ∈ Par′ such that 1 �∈ T2. We may assume by sym-metry that 2 ∈ T2, as T2 ∩ TP �= ∅. We have T1 ∩ T2 �= ∅as T1 ∈ Par′, therefore the absence of Type III error impliesthat 3 ∈ T2. Put T2 = 23I with I ∈ UI, and T1 = 1II′ withI′ ∈ UI. Now if we calculate the set P by using {TP, T1, T2}instead of Par′, then the result is

{12, 13, 1I, 2I, 2I′, 3I, 3I′} . (23)

In general, the actual set P is included in the set (23). Now,we present two properties. First, we show that 12, 13 ∈ P .Indeed, if 12 �∈ P , then we have 12 ∩ T = ∅ for somet ∈ Par′. Now, we have 3 ∈ T and T1 ∩ T �= ∅ as T ∈ Par′,therefore T1 and T contradict the absence of Type III error.Hence we have 12 ∈ P , and 13 ∈ P by symmetry. Sec-ondly, we show that no innocent users are output in Step 4.Indeed, if an I′′ ∈ UI is output in Step 4, then the possibilityof P mentioned above implies that I′′ ∈ {I, I′} and we havei ∈ P1 and i I′′ ∈ P for some i ∈ 123. This is impossible,as 12, 13 ∈ P . Hence this claim holds, therefore it sufficesto consider the case that nobody is output in Step 4, namelyP1 = ∅.

By these properties, we have either 2I′, 3I′ ∈ P or 2I′, 3I′ �∈P (otherwise I′ ∈ P1, a contradiction). Similarly, we haveP ∩ {2I, 2I′} �= ∅ and P ∩ {3I, 3I′} �= ∅. First, we considerthe case that 2I′, 3I′ ∈ P . As P1 = ∅, it does not hold that|P ∩ {1I, 2I, 3I}| �= 1. If 1I, 2I, 3I ∈ P , then |P| = 7,P2 ={I′}, and 2 and 3 are output in Step 5. If |P ∩ {1I, 2I, 3I}| =2, then |P| = 6,∅ �= P3 ⊆ UP and a pirate is correctlyoutput in Step 6. Finally, if 1I, 2I, 3I �∈ P , then |P| = 4and P = {12, 13, 2I′, 3I′}. Now, I′ is not output in Step 9, as123 ∈ Par′. Moreover, if none of 1, 2, and 3 is output in Step9, then it should hold that 12I′, 13I′, 23I′ ∈ Par′, contradict-ing the absence of Type IV error. Hence a pirate is correctlyoutput in Step 9, concluding the proof in the case 2I′, 3I′ ∈ P .

Secondly, we suppose that 2I′, 3I′ �∈ P , therefore 2I, 3I ∈P . There are two possibilities P = {12, 13, 2I, 3I} and P ={12, 13, 1I, 2I, 3I}. The former case is the same as the previ-ous paragraph. In the latter case, we have |P| = 5, Par′′ ⊆{12I, 13I} and P2 = 23. Hence 2 or 3 is correctly output inStep 7 when Par′′ �= ∅. On the other hand, when Par′′ = ∅, 2and 3 are correctly output in Step 8. Hence the proof in thecase 2I′, 3I′ �∈ P [therefore in the case (A)] is concluded.

5.2.2 Case (B)

As⋂

Par′ = ∅, there are I1, I2, I3 ∈ UI such that 12I3, 13I2,23I1 ∈ Par′. By the absence of Type IV error, it does nothold that I1 = I2 = I3. By symmetry, we may assumethat I1 �= I2. Then by calculating the set P by using{123, 12I3, 13I2, 23I1} instead of Par′, it follows that theactual P satisfies P ⊆ {12, 13, 23, 1I1, 2I2, 3I3}, while12, 13, 23 ∈ P by the assumption of the case (B). If P ={12, 13, 23}, then 1, 2 and 3 are output in Step 10. Therefore,it suffices to consider the case that {12, 13, 23} � P .

If I1 �= I3 �= I2, then we have ∅ �= P1 ⊆ I1I2I3 and apirate is correctly output in Step 4. Hence it suffices to con-sider the remaining case. By symmetry, we may assume thatI1 = I3 �= I2. If 2I2 ∈ P , then we have I2 ∈ P1 ⊆ I1I2, and2 is correctly output in Step 4. From now, we assume that2I2 �∈ P . If 1I1 �∈ P or 3I1 �∈ P , then we have P1 = {I1}as {12, 13, 23} � P , therefore 1 or 3 is correctly output inStep 4. On the other hand, if 1I1, 3I1 ∈ P , then we have P ={12, 13, 23, 1I1, 3I1}, while 13I1 �∈ Par′ by the absence ofType IV error (note that 12I1, 23I1 ∈ Par′), therefore Par′′ ={123},P2 = 2I1 and 2 is correctly output in Step 7. Hencethe proof in the case (B), therefore the proof of Proposition2, is concluded.

5.3 Proof of Proposition 3

To prove Proposition 3, let I1, I2 and I3 be three distinct inno-cent users. Given y′ and st = (p j ) j , we introduce the fol-lowing notation for j ∈ [m]:

ξ Hj =

{1 if p j = p,

0 if p j = 1− p,ξ L

j = 1− ξ Hj . (24)

Note that the sets Aσ for σ ∈ {H, L} defined in Sect. 3 satisfythat Aσ = { j | y′j = ξσ

j }. We write Aσ = Aσ (y′, st) andaσ = |Aσ | = aσ (y′, st) when we emphasize the dependencyon y′ and st. Then, as the bits of codewords are independentlychosen, we have

Pr [I1I2I3 ∈ Par(y′) | y′, st]= (1− p3)aL (1− (1− p)3)aH , (25)

therefore

Pr [I1I2I3 ∈ Par(y′)] =∑

y′,st

Pr [y′, st](1− p3)aL (y′,st)

×(1− (1− p)3)aH (y′,st) . (26)

Now, we present the following key lemma, which will beproven later:

Lemma 1 Among the possible pirate strategies ρ, the max-imum value of the right-hand side of (26) is attained by themajority vote attack, namely the attack word y for codewords

123

Page 11: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 95

w1, w2, w3 of three pirates satisfies that y j = 0 if at leasttwo of w1, j , w2, j , w3, j are 0 and y j = 1 otherwise.

If ρ is the majority vote attack, then for each j ∈ [m],we have j ∈ AH (y′, st) (i.e., ξ H

j becomes the majority

in w1, j , w2, j , w3, j ) with probability 3p2(1 − p) + p3 =3p2−2p3 and j ∈ AL(y′, st) with probability 1−3p2+2p3.This implies that

Pr [I1I2I3 ∈ Par(y′)]=

αL ,αHαL+αH=m

(

Pr [aL = αL , aH = αH ]

· (1− p3)αL (1− (1− p)3)αH

)

=∑

αL ,αHαL+αH=m

((m

αL

)

(1− 3p2 + 2p3)αL (3p2 − 2p3)αH

· (1− p3)αL (1− (1− p)3)αH

)

=∑

αL ,αHαL+αH=m

((m

αL

)

(1− 3p2 + p3 + 3p5 − 2p6)αL

· (9p3 − 15p4 + 9p5 − 2p6)αH

)

= (1− 3p2 + 10p3 − 15p4 + 12p5 − 4p6)m = f1(p)m .

(27)

By virtue of Lemma 1, for a general ρ, Pr [I1I2I3 ∈ Par(y′)]is bounded by the right-hand side of the above equality. Thisimplies the claim of Proposition 3, as there are

(N−33

)choices

of the triple I1, I2, I3.To complete the proof of Proposition 3, we give a proof

of Lemma 1.

Proof of Lemma 1 Fix the codewords w1, w2, w3 of thethree pirates 1, 2, 3 ∈ U . Let wP denote the collection ofthose three codewords. Let j0 ∈ [m] be the index of a detect-able column. By symmetry, we may assume without loss ofgenerality that w1, j0 = w2, j0 = 0 and w3, j0 = 1. Now lety0 be an arbitrary attack word such that y0

j0= 0, and let y1

and y? be the attack words obtained from y0 by changingthe j0-th column to 1 and to ?, respectively. We show thatif the pirate strategy ρ for the input wp is modified so thatit outputs y0 instead of y1 and y?, then the right-hand sideof (26) will not decrease. As wP, j0 and y0 are arbitrarilychosen, the claim of Lemma 1 then follows.

Let y′0 be an m-bit word such that y′0j = y0j for any

j ∈ [m] with y0j �= ?, therefore y′0 is obtained from y0 in

Step 1 in the tracing algorithm with positive probability. Lety′1 be the m-bit word obtained from y′0 by changing the j0-thcolumn to 1. Moreover, let st0 = (p j ) j be any state informa-

tion such that p j0 = 1− p, and let st1 be the state informationobtained from st0 by changing the j0-th component to p.

In this case, by independence of the columns, we havePr [wP | st0] = αp2(1− p) and Pr [wP | st1] = αp(1− p)2

for a common α > 0. As Pr [st0] = Pr [st1] > 0 andPr [wP] > 0, Bayes Theorem implies that Pr [st0 | wP] =α′ p2(1− p) and Pr [st1 | wP] = α′ p(1− p)2 for a commonα′ > 0, therefore

Pr [st0 | wP, (st0 or st1) ]= α′ p2(1− p)

α′ p2(1− p)+ α′ p(1− p)2 = p (28)

and Pr [st1 | wP, (st0 or st1) ] = 1 − p. Now there is acommon β > 0 such that, for each x ∈ {0, 1},Pr [y′0 | stx , y0] = Pr [y′1 | stx , y1] = β ,

Pr [y′0 | stx , y1] = Pr [y′1 | stx , y0] = 0 ,

Pr [y′0 | st0, y?] = Pr [y′1 | st1, y?] = βp ,

Pr [y′0 | st1, y?] = Pr [y′1 | st0, y?] = β(1− p) .

(29)

As the choice of the attack word y for given wP is indepen-dent of st, and the choice of the word y′ will be independentof wP once the attack word y is determined, it follows that

Pr [y′x , stx ′ | wP, (st0 or st1) , yx ′′ ]= Pr [stx ′ | wP, (st0 or st1) ]Pr [y′x | stx ′ , yx ′′ ] (30)

for x, x ′ ∈ {0, 1} and x ′′ ∈ {0, 1, ?}. By these relations, wehave

Pr [(y′0, st0) or (y′1, st1) | wP, (st0 or st1) , y0]= p · β + (1− p) · 0 = pβ ,

Pr [(y′1, st0) or (y′0, st1) | wP, (st0 or st1) , y0]= 1− pβ ,

Pr [(y′0, st0) or (y′1, st1) | wP, (st0 or st1) , y1]= p · 0+ (1− p) · β = (1− p)β ,

Pr [(y′1, st0) or (y′0, st1) | wP, (st0 or st1) , y1]= 1− (1− p)β ,

Pr [(y′0, st0) or (y′1, st1) | wP, (st0 or st1) , y?]= p · βp + (1− p) · βp = pβ ,

Pr [(y′1, st0) or (y′0, st1) | wP, (st0 or st1) , y?]= 1− pβ .

(31)

Now note that p ≥ 1/2, therefore we have 1 − p3 ≤ 1 −(1− p)3 and pβ ≥ (1− p)β. Note also that aH (y′0, st0) =aH (y′1, st1) = aH (y′0, st1) + 1 = aH (y′1, st0) + 1. Thisimplies that, in the case st ∈ {st0, st1}, if the pirate strategyρ for the input wp is modified in such a way that it outputs y0

instead of y1 and y?, then the right-hand side of (26) will notdecrease. As this property is in fact independent of the choiceof st0 and st1, the claim in the proof follows, concluding theproof of Lemma 1. ��

123

Page 12: Short collusion-secure fingerprint codes against three pirates

96 K. Nuida

5.4 Proof of Proposition 4

To prove Proposition 4, we fix an innocent user I0 ∈ UI andconsider the probability that there are T1, T2 ∈ Par(y′) suchthat I0 ∈ T1 ∩ T2 ⊆ UI, T1 ∩ TP = {1} and T2 ∩ TP = {2}; orequivalently, there are innocent users I1, I2 ∈ UI \ {I0} suchthat 1I0I1 ∈ Par(y′) and 2I0I2 ∈ Par(y′). We introduce somenotations. Given y′, w1, w2, wI0 , and st = (p j ) j , we define,for α, β, γ, δ ∈ {H, L},

aαβγ δ = |{ j ∈ [m] | y′j = ξαj ,

w1, j = ξβj , w2, j = ξ

γ

j , wI0, j = ξδj }| (32)

[see (24) for the notations]. Moreover, by using “∗” as awild-card, we extend naturally the definition of aαβγ δ to thecase α, β, γ, δ ∈ {H, L , ∗}. For example, we have aα∗∗δ =aαH Hδ + aαH Lδ + aαL Hδ + aαL Lδ . Note that ax∗∗∗ (x ∈{H, L}) is equal to the value ax in Sect. 3.

Now for an innocent user I1 �= I0, we have

Pr [1I0I1 ∈ Par(y′) | y′, w1, w2, wI0 , st]= paH L∗L (1− p)aL H∗H . (33)

Therefore, we have

Pr [1I0I1 ∈ Par(y′) for some I1 ∈ UI | y′, w1, w2, wI0 , st]≤ (N − 4)paH L∗L (1− p)aL H∗H (34)

as there are N − 4 choices of I1. Similarly, we have

Pr [2I0I2 ∈ Par(y′) for some I2 ∈ UI | y′, w1, w2, wI0 , st]≤ (N − 4)paH∗L L (1− p)aL∗H H . (35)

Hence the probability that 1I0I1, 2I0I2 ∈ Par(y′) for someI1, I2 ∈ UI, conditioned on the given y′, w1, w2, wI0 , andst, is lower than the minimum of the two values (N −4)paH L∗L (1−p)aL H∗H and (N−4)paH∗L L (1−p)aL∗H H , whichis not higher than

√(N−4)paH L∗L (1− p)aL H∗H · (N−4)paH∗L L (1− p)aL∗H H

= (N − 4)√

p aH L∗L+aH∗L L√

1− paL H∗H+aL∗H H

. (36)

Now given y′, w1, w2, and st, the probability that wI0attains the given values of aH L L L , aH L H L , aH H L L , aL H L H ,

aL H H H , and aL L H H (denoted here by η) is the product ofthe following six values

(aH L L∗aH L L L

)

(1− p)aH L L L paH L L∗−aH L L L , (37)

(aH L H∗aH L H L

)

(1− p)aH L H L paH L H∗−aH L H L , (38)

(aH H L∗aH H L L

)

(1− p)aH H L L paH H L∗−aH H L L , (39)

(aL H L∗aL H L H

)

paL H L H (1− p)aL H L∗−aL H L H , (40)

(aL H H∗aL H H H

)

paL H H H (1− p)aL H H∗−aL H H H , (41)

(aL L H∗aL L H H

)

paL L H H (1− p)aL L H∗−aL L H H . (42)

By the above results, it follows that

Pr [1I0I1, 2I0I2 ∈ Par(y′) for some I1, I2 ∈ UI | y′, w1,

w2, st] ≤∑(

η(N − 4)√

p 2aH L L L+aH L H L+aH H L L

·√1− paL L H H+aL H L H+2aL H H H

)

, (43)

where the sum runs over the possible values of aH L L L ,

aH L H L , aH H L L , aL H L H , aL H H H , and aL L H H . Now by theabove definition of η, the summand in the right-hand side isthe product of N − 4 and the following six values

(aH L L∗aH L L L

)

(1− p)aH L L L paH L L∗ , (44)

(aH L H∗aH L H L

)((1− p)

√p)aH L H L paH L H∗−aH L H L , (45)

(aH H L∗aH H L L

)((1− p)

√p)aH H L L paH H L∗−aH H L L , (46)

(aL H L∗aL H L H

) (p√

1− p)aL H L H

(1− p)aL H L∗−aL H L H , (47)

(aL H H∗aL H H H

)

paL H H H (1− p)aL H H∗ , (48)

(aL L H∗aL L H H

) (p√

1− p)aL L H H

(1− p)aL L H∗−aL L H H . (49)

Then by the binomial theorem, the sum is equal to

(N − 4) (p(2− p))aH L L∗ (p + (1− p)√

p)aH L H∗+aH H L∗

·(

1− p + p√

1− p)aL H L∗+aL L H∗

· ((1− p)(1+ p))aL H H∗ . (50)

Given y′, st, w1, w2, and w3, we define, for α, β, γ, δ ∈{H, L},

bαβγ δ = |{ j ∈ [m] | y′j = ξαj , w1, j = ξ

βj , w2, j = ξ

γ

j ,

w3, j = ξδj }| . (51)

123

Page 13: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 97

Then by Marking Assumption, (50) is equal to

(N − 4)(2p − p2)bH L L H

· (p + (1− p)√

p)bH L H L+bH L H H+bH H L L+bH H L H

·(

1− p + p√

1− p)bL H L L+bL H L H+bL L H L+bL L H H

·(1− p2)bL H H L

= (N − 4)(2p − p2)bH L L H (1− p2)bL H H L

· (p + (1− p)√

p)bH L H L+bH H L H

·(

1− p + p√

1− p)bL H L H+bL L H L

· (p + (1− p)√

p)bH L H H+bH H L L

·(

1− p + p√

1− p)bL H L L+bL L H H

. (52)

By writing the right-hand side of (52) as η′, it follows that

Pr [1I0I1, 2I0I2 ∈ Par(y′) for some I1, I2 ∈ UI

| w1, w2, w3] ≤∑

y′,st

Pr [y′, st | w1, w2, w3]η′ . (53)

Now, we present the following key lemma, which will beproven later:

Lemma 2 Among the possible pirate strategies ρ, the maxi-mum value of the right-hand side of (53) is attained by major-ity vote attack ρmaj (cf., Lemma 1).

By (53), we have

Pr [1I0I1, 2I0I2 ∈ Par(y′) for some I1, I2 ∈ UI]≤

w1,w2,w3

Pr [w1, w2, w3]∑

y′,st

Pr [y′, st | w1, w2, w3]η′

=∑

y′,st,w1,w2,w3

Pr [y′, st, w1, w2, w3]η′ . (54)

By virtue of Lemma 2, the maximum value of the right-hand side is attained by majority vote attack ρmaj. Now forρ = ρmaj, the word y′ is uniquely determined by w1, w2, andw3, and we have bH L L H = bL H H L = bH L H L = bL H L H =bH H L L = bL L H H = 0, bH H L H = dH L H , bL L H L =dL H L , bH L H H = dL H H , and bL H L L = dH L L , where, forα, β, γ ∈ {H, L},dαβγ = |{ j ∈ [m] | w1, j = ξα

j , w2, j = ξβj , w3, j = ξ

γ

j }| .(55)

This implies that

η′ = (N − 4)(

p + (1− p)√

p)dL H H+dH L H

·(

1− p + p√

1− p)dH L L+dL H L

. (56)

Put dother = m−dH L L−dL H L−dL H H−dH L H . Now givenst, the probability that w1, w2 and w3 attain the given values

of dH L L , dL H L , dL H H and dH L H is(

m

dH L L , dL H L , dL H H , dH L H , dother

)

·(p(1− p)2)dH L L+dL H L

·(p2(1− p))dL H H+dH L H (1− 2p(1− p))dother (57)

which is independent of st. This implies that∑

y′,st,w1,w2,w3

Pr [y′, st, w1, w2, w3]η′

=∑ (

m

dH L L , dL H L , dL H H , dH L H , dother

)

(N − 4)

·(

p(1− p)2(1− p + p√

1− p))dH L L+dL H L

·(

p2(1− p)(p + (1− p)√

p))dL H H+dH L H

· (1− 2p(1− p))dother (58)

(where the sum runs over the possible values of dH L L , dL H L ,

dL H H , and dH L H )

=∑(

m

d−−L , d−−H , dother

)

(N − 4)

·(

p(1− p)5/2(p +√1− p)

)d−−L

·(

p5/2(1− p)(1− p +√p))d−−H

·(

1− 2p + 2p2)dother

(59)

(where the sum runs over the possible values of d−−L =dH L L + dL H L and d−−H = dL H H + dH L H )

= (N − 4)(

p(1− p)5/2(p +√1− p)

+p5/2(1− p)(1− p +√p)+ 1− 2p + 2p2)m

= (N − 4) f2(p)m . (60)

By the above argument, the value Pr [1I0I1, 2I0I2 ∈ Par(y′)for some I1, I2 ∈ UI] for a general ρ is also bounded by theabove value. Hence Proposition 4 follows, by considering thenumber of choices of the pair 1, 2 and the innocent user I0.

To complete the proof of Proposition 4, we give a proofof Lemma 2.

Proof of Lemma 2 First, note that 1/2 ≤ p < 1, therefore0 < 2p − p2 < 1, 0 < 1 − p2 < 1 and 0 < 1 − p +p√

1− p ≤ p + (1 − p)√

p < 1. Now by the definition(52) of η′, for each j ∈ [m] such that w1, j = w2, j �= w3, j ,the value of η′ is increased by setting the j-th bit of the attackword y to be w1, j instead of w3, j or “?” (which makes thevalues of bH L L H and bL H H L smaller).

We consider the case that w1, j = w3, j �= w2, j . If w1, j =ξ H

j , then the contribution of the j-th column to the value η′ isp+(1− p)

√p when y′j = w1, j and 1− p+ p

√1− p when

123

Page 14: Short collusion-secure fingerprint codes against three pirates

98 K. Nuida

y′j = w2, j . On the other hand, if w1, j = ξ Lj , then the contri-

bution of the j-th column to the value η′ is 1− p+ p√

1− pwhen y′j = w1, j and p+(1− p)

√p when y′j = w2, j . Recall

the relation 1− p + p√

1− p ≤ p + (1− p)√

p. Now thesame argument as Lemma 1 implies that Pr [w1, j = ξ H

j ] =p ≥ 1− p = Pr [w1, j = ξ L

j ] in this case. This implies thatthe value of the right-hand side of (53) is not decreased bysetting y′j to be w1, j instead of w2, j (the detail of the proof issimilar to the proof of Lemma 1). Similarly, in the case thatw1, j �= w2, j = w3, j , the value of the right-hand side of (53)is not decreased by setting y′j to be w2, j instead of w1, j .

Summarizing, the value of the right-hand side of (53) isnot decreased by setting y′j to be the majority of w1, j , w2, j ,and w3, j , instead of the minority of them. Hence the maxi-mum value of the right-hand side of (53) is attained by themajority vote attack, concluding the proof of Lemma 2. ��

5.5 Proof of Proposition 5

To prove Proposition 5, we fix an innocent user I and supposethat S(i) < Z for every i ∈ 123. Given y′, w1, w2, w3, andst, we define, for α, β, γ, δ ∈ {H, L},aαβγ δ

= |{ j ∈ [m] | y′j =ξαj , w1, j =ξ

βj , w2, j =ξ

γ

j , w3, j =ξδj }|.(61)

Then, we have

Pr [12I, 13I, 23I ∈ Par(y′) | y′, w1, w2, w3, st]= paH L L H+aH L H L+aH H L L (1− p)aL L H H+aL H L H+aL H H L .

(62)

Let aL and aH be as defined in Sect. 3. For x ∈ {L , H}, let aux

and adx be the number of indices j ∈ [m] of undetectable and

detectable columns, respectively, such that y′j = ξ xj . Note

that aH = auH + ad

H , while we have auH = aH H H H and

auL = aL L L L by Marking Assumption. Now, we have

S(1)+ S(2)+ S(3)

=(

3aH H H H + 2(aH L H H + aH H L H + aH H H L)

+ aH L L H + aH L H L + aH H L L

)log

1

p

+(

3aL L L L + 2(aL L L H + aL L H L + aL H L L)

+ aL L H H + aL H L H + aL H H L

)log

1

1− p

= auH log

1

p+ au

L log1

1− p

+ 2

(

aH log1

p+ aL log

1

1− p

)

− (aH L L H + aH L H L + aH H L L) log1

p

− (aL L H H + aL H L H + aL H H L) log1

1− p, (63)

therefore

(aH L L H + aH L H L + aH H L L) log1

p

+(aL L H H + aL H L H + aL H H L) log1

1− p

= 2

(

aH log1

p+ aL log

1

1− p

)

+ auH log

1

p

+ auH log

1

1− p− S(1)− S(2)− S(3)

> 2

(

aH log1

p+ aL log

1

1− p

)

+ auH log

1

p

+ auL log

1

1− p− 3Z0 (64)

where we used the assumptions that S(i) < Z for everyi ∈ 123 and Z ≤ Z0. By using the relation aL = m − aH

and the definition (7) of Z0, the right-hand side of the aboveinequality is equal to

(3p − 1)m log1

1− p

+ aH

(

(2− 3p) log1

p+ (1− 3p) log

1

1− p

)

+ auH log

1

p+ au

L log1

1− p

−3

√√√√1

2

((

log1

p

)2

aH +(

log1

1− p

)2

aL

)

logN

ε0

= (3p − 1)m log1

1− p+ au

L log1

1− p

+ auH

(

(3− 3p) log1

p+ (1− 3p) log

1

1− p

)

+ adH

(

(2− 3p) log1

p+ (1− 3p) log

1

1− p

)

−3

(1

2

((

log1

1− p

)2

m

−((

log1

1− p

)2

−(

log1

p

)2)

aH

)

logN

ε0

)1/2

(65)

(where we used the relation aH = auH + ad

H )

≥ (3p − 1)m log1

1− p

+ auH

(

(3− 3p) log1

p+ (1− 3p) log

1

1− p

)

123

Page 15: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 99

+ adH

(

(2− 3p) log1

p+ (1− 3p) log

1

1− p

)

+ auL log

1

1− p− 3

√1

2m log

N

ε0log

1

1− p(66)

(where we used the fact log(1/(1 − p)) ≥ log(1/p) > 0).By applying the above inequalities to (62), we have

Pr [12I, 13I, 23I ∈ Par(y′) | y′, w1, w2, w3, st]< (1− p)(3p−1)m(1− p)−3

√(m/2) log(N/ε0)

·(

p3−3p(1− p)1−3p)au

H(1− p)au

L

·(

p2−3p(1− p)1−3p)ad

H. (67)

We write the right-hand side of (67) as η. Then, we have

Pr [12I, 13I, 23I ∈ Par(y′) | w1, w2, w3]<

y′,stS(1),S(2),S(3)<Z

Pr [y′, st | w1, w2, w3]η

≤∑

y′,st

Pr [y′, st | w1, w2, w3]η . (68)

Now we present the following key lemma, which will beproven later:

Lemma 3 Among the possible pirate strategies ρ, the maxi-mum value of the right-hand side of (68) is attained by major-ity vote attack ρmaj (cf., Lemma 1).

By (68), we have

Pr [12I, 13I, 23I ∈ Par(y′)]<

w1,w2,w3

Pr [w1, w2, w3]∑

y′,st

Pr [y′, st | w1, w2, w3]η

=∑

y′,st,w1,w2,w3

Pr [y′, st, w1, w2, w3]η . (69)

By virtue of Lemma 3, the maximum value of the right-hand side is attained by majority vote attack ρmaj. Now forρ = ρmaj and given st, the probability that w1, w2, w3 andy′ attain the given values of au

H , auL , and ad

H is(

m

auH , au

L , adH , ad

L

)

(p3)auH ((1− p)3)au

L

· (3p2(1− p))adH (3p(1− p)2)ad

L

(70)

which is independent of st, where we put adL = m − au

H −au

L − adH . Hence we have

y′,st,w1,w2,w3

Pr [y′, st, w1, w2, w3]η

=∑

auH ,au

L ,adH

((m

auH , au

L , adH , ad

L

)

(p3)auH ((1− p)3)au

L

· (3p2(1− p))adH (3p(1− p)2)ad

L η

)

= (1− p)(3p−1)m(1− p)−3√

(m/2) log(N/εL )

·∑((

m

auH , au

L , adH , ad

L

) (p6−3p(1− p)1−3p

)auH

·((1− p)4

)auL(

3p4−3p(1− p)2−3p)ad

H

×(

3p(1− p)2)ad

L)

(71)

(where the sum runs over the possible values of auH , au

L , adH ,

and adL )

= (1− p)(3p−1)m(1− p)−3√

(m/2) log(N/ε0)

·(

p6−3p(1− p)1−3p + (1− p)4

+3p4−3p(1− p)2−3p + 3p(1− p)2)m

= (1− p)(3p−1)m(1− p)−3√

(m/2) log(N/ε0)

·(

p4−3p(p2 − 3p + 3)(1− p)1−3p

+(1− p)2(p2 + p + 1))m

= (1− p)−3√

(m/2) log(N/ε0) f3(p)m . (72)

By the above argument, the value Pr [12I, 13I, 23I ∈Par(y′)] for a general ρ is also bounded by the above value.Hence Proposition 5 follows, as there exist N − 3 choices ofthe innocent user I.

To complete the proof of Proposition 5, we give a proofof Lemma 3.

Proof of Lemma 3 First note that, by Marking Assumption,

the terms in η other than(

p2−3p(1− p)1−3p)ad

H are indepen-dent of the choice of y′ for given w1, w2, and w3. An elemen-tary analysis shows that p2−3p(1− p)1−3p is an increasingfunction of p ∈ [1/2, 1), therefore p2−3p(1 − p)1−3p ≥(1/2)2−3/2(1/2)1−3/2 = 1. Hence the value of η will beincreased by making the value of ad

H as large as possible. Bythe same argument as Lemma 1, under the condition that thej-th column is detectable, the probabilities that the majorityamong w1, j , w2, j , and w3, j is ξ H

j and ξ Lj are p and 1 − p,

respectively. In other words, the probabilities that ξ Hj is the

majority and the minority among w1, j , w2, j , and w3, j arep and 1 − p, respectively. As p ≥ 1 − p, it follows thatthe value of the right-hand side of (68) will not decrease bysetting the j-th bit of y′ to be the majority of w1, j , w2, j , andw3, j instead of the minority of them (the detail of the proof issimilar to the proof of Lemma 1). Hence the maximum valueof the right-hand side of (68) is attained by the majority voteattack, concluding the proof of Lemma 3. ��

123

Page 16: Short collusion-secure fingerprint codes against three pirates

100 K. Nuida

5.6 Proof of Proposition 6

First, we introduce some notations. Given the codewords w1

and w2 of the two pirates 1 and 2, let au and ad denotethe numbers of undetectable and detectable columns, respec-tively. Then by Marking Assumption and the choice p = 1/2,we have S(1) + S(2) = (2au + ad) log 2 regardless of thepirate strategy ρ. This implies that, if S(1) < Z and S(2) <

Z , then we have

(2au + ad) log 2 < 2Z ≤ 2Z0

= m log 2+√

2m logN

ε0log 2. (73)

By the relation au + ad = m, this implies that 2m − ad <

m + √2m log(N/ε0), or equivalently ad − m/2 > m/2 −√

2m log(N/ε0). Now for each j ∈ [m], the probability thatthe j-th column becomes detectable is 1/2, therefore, theexpected value of ad is m/2. Then, Hoeffding’s Inequality(Theorem 5) implies that

Pr [S(1) < Z and S(2) < Z ]≤ Pr [ad − m/2 > m/2−√

2m log(N/ε0)]

≤ exp

⎜⎝−2m2

(m/2−√

2m log(N/ε0))2

m

⎟⎠

= exp

⎜⎝−m2

(√m −√

8 log(N/ε0))2

2

⎟⎠ (74)

provided m/2 −√2m log(N/ε0) > 0. The last condition is

equivalent to that m > 8 log(N/ε0) which is satisfied underthe condition (12). Now put m = 8α log(N/ε0) with α > 1.Then under the condition (12), we have

m2(√

m −√8 log(N/ε0)

)2

2

= m2

2

(√

α ·√

8 logN

ε0−

8 logN

ε0

)2

= 4m2 (√α − 1

)2log

N

ε0

> 162(

logN

ε0

)3 (

1+ 1

16 log(N/ε0)− 1

)2

= logN

ε0, (75)

therefore the right-hand side of (74) is smaller than ε0/N .Hence the proof of Proposition 6 is concluded.

5.7 Proof of Proposition 7

Let 1 ∈ U be the unique pirate. Then by MarkingAssumption and the choice p = 1/2, we have y′ = w1

and S(1) = m log 2, while Z ≤ Z0 = (m/2) log 2 +√(m/2) log(N/ε0) log 2. Now by the assumption m ≥

2 log(N/ε0), we have

S(1)− Z0

log 2= m

2−

√m

2log

N

ε0

=√

m

2

(√m

2−

logN

ε0

)

≥ 0 , (76)

therefore S(1) ≥ Z0 ≥ Z . Hence the proof of Proposition 7is concluded.

5.8 Proof of Theorem 3

First, by the choice of parameters, we have log(N/ε0) =log N + αm ∼ (R log 2 + α)m when m → ∞ and 0 <

R log 2+α < 1/10, therefore the condition (12) in Theorem2 is satisfied for any sufficiently large m. Now it suffices toshow that, when m →∞, the latter three of the four terms in(11) approach (as well as ε0) to 0 exponentially. The secondterm is smaller than N 3(7/8)m/6, and we have

log

(

N 3(

7

8

)m)

= 3 log N + m log7

8∼ (3R log 2+ log 7− 3 log 2)m (77)

when m → ∞, and 3R log 2 + log 7 − 3 log 2 < 0 by thecondition for R. Hence the second term approaches exponen-tially to 0. Similarly, the third term is smaller than 3N 2((10+√

2)/16)m and we have log(N 2((10 + √2)/16)m)/m ∼2R log 2+ log((10+√2)/16) < 0 when m →∞, thereforethe third term also approaches exponentially to 0. Moreover,the natural logarithm of the fourth term is smaller than

log N + 3

√m

2log

N

ε0log 2+ m log

7√

2

16(78)

and this value is asymptotically equal to

m R log 2+ 3

√m

2(m R log 2+ αm) log 2+ m log

7√

2

16

=(

R log 2+3

√R log 2+α

2log 2+log

7√

2

16

)

m. (79)

Now note that if we regard the second term in (15) as a func-tion f (R0) of R0, then the function f (x) is monotonicallydecreasing for x ≤ 1 − log 7/(3 log 2), therefore we have

123

Page 17: Short collusion-secure fingerprint codes against three pirates

Short collusion-secure fingerprint codes 101

α < f (R0) < f (R) by the choice of α. This implies that

R log 2+ α

2<

(R log 2+ log(7

√2/16)

3 log 2

)2

, (80)

while we have R log 2 + log(7√

2/16) < 0, therefore thecoefficient of m in (79) is smaller than

R log 2+ 3−R log 2− log(7

√2/16)

3 log 2log 2

+ log7√

2

16= 0. (81)

Hence the fourth term also approaches exponentially to 0,therefore the proof of Theorem 3 is concluded.

5.9 Proof of Theorem 4

First, we verify the condition (12) in Theorem 2. By thechoice of ε0, we have log(N/ε0) = 3m/40, therefore theright-hand side of (12) is equal to

8 · 3m

40

(

1+ 1

16 · 3m/40

)2

= 3m

5

(

1+ 5

6m

)2

. (82)

By the choice of m and the conditions N ≥ 1 and ε ≤ 1/3,we have m ≥ 9K log 3� ≥ 25, therefore the right-hand sideof (82) is not larger than

3m

5

(

1+ 5

6 · 25

)2

= 3 · 312

5 · 302 m < m. (83)

Hence the condition (12) is indeed satisfied.From now, we evaluate the four terms in (11). The first

term is not larger than

N exp

(

−27K log(N/ε)

40

)

= N

(N

ε

)−27K/40

= N 1−27K/40ε27K/40 (84)

and, as 27K/40 > 1, N ≥ 1 and 0 < ε ≤ 1/3, the right-hand side of the above equality is not larger than 31−27K/40ε.The second term in (11) is not larger than

N 3

6

(7

8

)9K log(N/ε)

= N 3

6

(N

ε

)9K log(7/8)

= 1

6N 3−9K log(8/7)ε9K log(8/7) (85)

and, as 9K log(8/7) = 3 and 0 < ε ≤ 1/3, the right-handside of the above equality is not larger than ε/54. The thirdterm in (11) is not larger than

3N 2β9K log(N/ε) = 3N 2(

N

ε

)9K log β

= 3N 2+9K log βε−9K log β, (86)

where we put β = (10 + √2)/16 for simplicity. As9K log β < −2, N ≥ 1 and 0 < ε ≤ 1/3, the right-handside of the above equality is not larger than 39K log β+2ε.Moreover, by the choice of ε0, the fourth term in (11) isequal to

(N − 3)8√

(m/2)·(3m/40)

(7√

2

16

)m

= (N − 3)

(

8√

15/20 7√

2

16

)m

. (87)

As 8√

15/207√

2/16 < exp(−3/40), the right-hand side ofthe above equality is not larger than ε0, therefore the aboveargument implies that the fourth term in (11) is not largerthan 31−27K/40ε.

Summarizing, the error probability is lower than

(2 · 31−27K/40 + 1/54+ 39K log β+2)ε . (88)

A numerical calculation shows that this value is smaller thanε. Hence the proof of Theorem 4 is concluded.

6 Conclusion

In this article, we proposed a new construction of probabilis-tic 3-secure codes and presented a theoretical evaluation oftheir error probabilities. A characteristic of our tracing algo-rithm is to make use of both score comparison and search ofthe triples of “parents” for a given pirated fingerprint word.Some numerical examples showed that code lengths of ourproposed codes are significantly shorter than the previousprovably secure 3-secure codes.

Acknowledgments A preliminary version of this article was pre-sented at The 12th Information Hiding (IH 2010), Calgary, Canada,June 28–30, 2010 [13]. The author would like to express his deep grat-itude to Dr. Teddy Furon, who gave several invaluable comments andsuggestions as the shepherd of the author’s article in the conference.Also, the author would like to thank the anonymous referees at both theconference and this journal for their precious and detailed comments.

References

1. Amiri, E., Tardos, G.: High rate fingerprinting codes and the fin-gerprinting capacity. In: Proceedings of SODA 2009, pp. 336–345.ACM (2009)

2. Barg, A., Blakley, G.R., Kabatiansky, G.A.: Digital fingerprint-ing codes: problem statements, constructions, identification of trai-tors. IEEE Trans. Inf. Theory 49, 852–865 (2003)

3. Blakley, G.R., Kabatiansky, G.A.: Random coding technique fordigital fingerprinting codes. In: Proceedings of IEEE ISIT 2004,p. 202. IEEE, Los Alamitos (2004)

4. Blayer, O., Tassa, T.: Improved versions of Tardos’ fingerprintingscheme. Des. Codes Cryptogr. 48, 79–103 (2008)

123

Page 18: Short collusion-secure fingerprint codes against three pirates

102 K. Nuida

5. Boneh, D., Shaw, J.: Collusion-secure fingerprinting for digitaldata. IEEE Trans. Inf. Theory 44, 1897–1905 (1998)

6. Cotrina-Navau, J., Fernandez, M., Soriano, M.: A family of collu-sion 2-secure codes. In: Barni, M., Herrera-Joancomartí, J., Kat-zenbeisser, S., Pérez-González, F. (eds.) IH 2005. LNCS, vol. 3727,pp. 387–397. Springer, Heidelberg (2005)

7. Hoeffding, W.: Probability inequalities for sums of bounded ran-dom variables. J. Am. Stat. Assoc. 58, 13–30 (1963)

8. Huang, Y.-W., Moulin, P.: Maximin optimality of the arcsine finger-printing distribution and the interleaving attack for large coalitions.In: Proceedings of IEEE WIFS 2010. IEEE, Los Alamitos (2010)

9. Huang, Y.-W., Moulin, P.: On the saddle-point solution and thelarge-coalition asymptotics of fingerprinting games. e-Print arXiv(cs.IT) 1011.1261v2. http://arXiv.org/abs/1011.1261v2 (2011)

10. Kitagawa, T., Hagiwara, M., Nuida, K., Watanabe, H., Imai, H.:A group testing based deterministic tracing algorithm for a shortrandom fingerprint code. In: Proceedings of ISITA 2008, pp. 706–710 (2008)

11. Nuida, K.: An improvement of short 2-secure fingerprint codesstrongly avoiding false-positive. In: Katzenbeisser, S., Sadeghi,A.-R. (eds.) IH 2009. LNCS, vol. 5806, pp. 161–175. Springer,Heidelberg (2009)

12. Nuida, K.: Making collusion-secure codes (more) robust against biterasure. IACR Cryptology ePrint Archive 2009/549. http://eprint.iacr.org/2009/549 (2009)

13. Nuida, K.: Short collusion-secure fingerprint codes against threepirates. In: Böhme, R., Fong, P.W.L., Safavi-Naini, R. (eds.) IH2010. LNCS, vol. 6387, pp. 86–102. Springer, Heidelberg (2010)

14. Nuida, K., Fujitsu, S., Hagiwara, M., Imai, H., Kitagawa, T., Og-awa, K., Watanabe, H.: An efficient 2-secure and short randomfingerprint code and its security evaluation. IEICE Trans. Fundam.E92-A, 197–206 (2009)

15. Nuida, K., Fujitsu, S., Hagiwara, M., Kitagawa, T., Watanabe, H.,Ogawa, K., Imai, H.: An improvement of discrete Tardos finger-printing codes. Des. Codes Cryptogr. 52, 339–362 (2009)

16. Nuida, K., Hagiwara, M., Kitagawa, T., Watanabe, H., Ogawa, K.,Fujitsu, S., Imai, H.: A tracing algorithm for short 2-secure prob-abilistic fingerprinting codes strongly protecting innocent users.In: Proceedings of IEEE CCNC 2007, pp. 1068–1072. IEEE, LosAlamitos (2007)

17. Schaathun, H.G.: Fighting three pirates with scattering codes. In:Proceedings of IEEE ISIT 2004. IEEE, Los Alamitos (2004)

18. Schaathun, H.G.: Fighting three pirates with scattering codes.Reports in informatics 263, Department of Informatics, Uni-versity of Bergen. http://www.ii.uib.no/publikasjoner/texrap/pdf/2004-263.pdf (2004)

19. Schaathun, H.G.: On the assumption of equal contributions in fin-gerprinting. IEEE Trans. Inf. Forensics Secur. 3, 569–572 (2008)

20. Sebé, F., Domingo-Ferrer, J.: Short 3-secure fingerprinting codesfor copyright protection. In: Batten, L., Seberry, J. (eds.) ACISP2002. LNCS, vol. 2384, pp. 316–327. Springer, Heidelberg (2002)

21. Škoric, B., Katzenbeisser, S., Celik, M.U.: Symmetric Tardos fin-gerprinting codes for arbitrary alphabet sizes. Des. Codes Cryp-togr. 46, 137–166 (2008)

22. Škoric, B., Katzenbeisser, S., Schaathun, H.G., Celik, M.U.: Tardosfingerprinting codes in the combined digit model. In: Proceedingsof IEEE WIFS 2009. IEEE, Los Alamitos (2009)

23. Škoric, B., Vladimirova, T.U., Celik, M.U., Talstra, J.C.: Tardosfingerprinting is better than we thought. IEEE Trans. Inf. The-ory 54, 3663–3676 (2008)

24. Tardos, G.: Optimal probabilistic fingerprint codes. J. ACM 55, 1–24 (2008)

123