

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 2, NO. 4, DECEMBER 2007 697

Collusion-Resistant Video Fingerprinting for Large User Group

Shan He, Student Member, IEEE, and Min Wu, Senior Member, IEEE

Abstract—Digital fingerprinting protects multimedia content from illegal redistribution by uniquely marking copies of the content distributed to users. Most existing multimedia fingerprinting schemes consider a user set on the scale of thousands. However, in such real-world applications as video-on-demand distribution, the number of potential users can be as many as 10–100 million. This large user size demands not only strong collusion resistance but also high efficiency in fingerprint construction and detection, which makes most existing schemes incapable of being applied to these applications. A recently proposed joint coding and embedding fingerprinting framework provides a promising balance between collusion resistance, efficient construction, and detection, but several issues remain unsolved for applications involving a large group of users. In this paper, we explore how to employ the joint coding and embedding framework and develop practical algorithms to fingerprint video in such challenging settings as to accommodate more than ten million users and resist hundreds of users’ collusion. We investigate the proper code structure for large-scale fingerprinting and propose a trimming detection technique that can reduce the decoding computational complexity by more than three orders of magnitude at the cost of less than 0.5% loss in detection probability under moderate to high watermark-to-noise ratios. Both analytic and experimental results show a high potential of joint coding and embedding to meet the needs of real-world large-scale fingerprinting applications.

Index Terms—Collusion resistance, joint coding and embedding, video fingerprinting.

I. INTRODUCTION

With the advances of broadband communication and compression technologies, an increasing amount of video is being shared among groups of users through the Internet and other broadband channels. In the meantime, piracy becomes increasingly rampant as users can easily duplicate and redistribute the received video to a large audience. Digital fingerprinting is an emerging technology to protect multimedia content from such unauthorized dissemination, where a unique ID representing each user, called a digital fingerprint, is embedded in the copy distributed to this user. When a copy is leaked, the embedded fingerprint can help to trace back the source of the leak. Adversaries may apply various attacks to remove the fingerprints before redistribution. Collusion is a powerful multiuser attack, where a group of users combines their copies of the same multimedia content to generate a new version with fingerprints attenuated or removed. In addition to resistance against attacks, three aspects of system efficiency need to be considered when designing an anticollusion fingerprinting system, namely, the efficiency in constructing, detecting, and distributing fingerprinted signals. Construction efficiency concerns the computational complexity involved during the generation of fingerprinted content; if the complexity is high, we say the construction efficiency is low and vice versa. Similarly, detection efficiency is related to the detection computational complexity. The distribution efficiency refers to the amount of bandwidth consumed during the transmission of all fingerprinted signals through a network. The more bandwidth the transmission requires, the lower the efficiency of distribution is.

Manuscript received January 31, 2007; revised June 18, 2007. This work was supported in part by the U.S. Office of Naval Research under Young Investigator Award N00014-05-10634 and in part by the U.S. National Science Foundation under CAREER Award CCR-0133704. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Ingemar Cox.

The authors are with the Department of Electrical and Computer Engineering and the Institute of Advanced Computing Studies, University of Maryland at College Park, College Park, MD 20742 USA (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIFS.2007.908179

A growing number of techniques have been proposed in the literature to provide collusion resistance in multimedia fingerprinting systems. Many of them fall in one of two categories: the coded and the noncoded approaches. Orthogonal fingerprinting is a typical example of designing fingerprint signals without an explicit step of coding. It assigns a spread-spectrum sequence to each user as his or her fingerprint, and the sequences among users are mutually orthogonal [1], [2]. An early work on coded fingerprinting focused on generic data and introduced a two-level construction in the code domain to resist up to $c$ colluders with high probability [3]. This binary code was later used to modulate a direct spread-spectrum sequence to embed the fingerprints in multimedia signals [4]. Based on an abstract assumption on embedding, Safavi-Naini [5] employed a $q$-ary error correcting code (ECC) constructed as a $c$-traceability code to resist $c$ colluders. An anticollusion code based on combinatorial design was proposed in [6], where each code bit is embedded in an overlapped fashion by modulating a spreading sequence that covers the entire multimedia signal. It has been shown in [7] that ECC fingerprinting allows for much higher efficiency in construction, detection, and distribution of a fingerprinted signal than noncoded orthogonal fingerprinting, but it has rather limited collusion resistance. With a new permuted subsegment embedding (PSE) technique proposed in [7], the coding and embedding layers work together and substantially improve the collusion resistance of ECC fingerprinting, providing a better tradeoff between collusion resistance and construction/detection efficiency than noncoded fingerprinting.

1556-6013/$25.00 © 2007 IEEE


TABLE I. PERFORMANCE COMPARISON OF EXISTING FINGERPRINTING SCHEMES WITH USER NUMBER N = 10 MILLION

Among the existing fingerprinting schemes, most works consider experimental settings with a user number on the order of a few thousand and a small collusion group of around ten colluders. However, in real-world applications, such as cable TV, the number of users can be as many as 10–100 million. The potential collusion group may involve hundreds of colluders. Most existing schemes cannot reach such large user size and high collusion resistance requirements. For example, to hold ten million users, the Boneh–Shaw scheme [3] gives a code on the order of 10 bits that need 22-h video for reliable embedding, and it can only resist ten users’ collusion. On the other hand, the orthogonal fingerprinting can be scaled up to hold 10 million users with a collusion resistance of 100, but the detection computational complexity increases linearly with the number of users and becomes prohibitively high for large-scale systems.

A related problem to fingerprinting is traitor tracing through cryptographic keys, where the main goal is to protect the access of multimedia rather than the content itself. There have been several works studying this type of traitor tracing problem in the literature [8]–[10], taking into account the large scale of the user group. The basic idea of these works is to use codes to establish a set of decryption keys for each user to access the content, which shares similarity with the coded fingerprinting such as [5] and [11]. When directly extended to the fingerprinting application by modulating each code symbol with a spreading sequence and adding the sequence to the host signal, these schemes would have rather limited collusion resistance as shown in [5] and [11]. This is because the embedding layer is not well utilized.

Table I summarizes the performance of existing fingerprinting schemes and those extended from traitor tracing schemes for a group of 10 million users with a probability of miss detection on the order of 10 . From this table, we can see that to hold such a large user group, the Boneh–Shaw scheme requires an extremely long host signal and has very low collusion resistance. For all other schemes, we fix the host signal length and compare their collusion resistance and detection computational complexity. We observe a low collusion resistance for traditional ECC-based fingerprinting and high detection complexity for orthogonal fingerprinting and the anticollusion-code (ACC) fingerprinting.

Different from other fingerprinting schemes, our recently proposed joint coding and embedding strategies built on top of ECC and spread-spectrum embedding [7] offer much improved collusion resistance while retaining the efficiency in construction, detection, and distribution of the fingerprinted signal. Such advantages of the improved ECC fingerprinting make it attractive for video applications, especially under the challenging settings of millions of users and hundreds of colluders. The large-scale system, however, introduces several issues that have not been addressed in the existing literature. First, in a large-scale system where millions of fingerprinted copies need to be generated and distributed, how to efficiently construct fingerprints with low computational complexity becomes an important issue. Meanwhile, the requirements of high collusion resistance place a constraint on the code construction and embedding. The construction and embedding of the code to meet efficient construction and high collusion resistance is also an important issue that needs to be addressed. Second, the detection process of the existing improved ECC-based fingerprinting is based on searching for maximal correlation. Although giving good detection performance, this kind of detector would have high computational complexity given the large number of users, which becomes an issue in real-world applications. Therefore, we need to pursue more efficient detection to reduce the computational complexity and retain good detection performance.

In this paper, we explore the application of the joint coding-embedding framework to video fingerprinting for large-scale systems and address a few major design and algorithmic issues. In particular, we first address the issue of code structure to achieve high collusion resistance and maintain efficient construction and distribution. We then propose a trimming-based detection algorithm that significantly speeds up the detection while maintaining good detection performance. To the best of our knowledge, this is the first work applying embedded fingerprinting to a multimedia signal with a challenging user capacity of tens of millions of users and a collusion resistance of hundreds of colluders.

This paper is organized as follows. Section II provides a general background on joint coding and embedding ECC fingerprinting. In Section III, we address the code structure issue when applying joint coding and embedding ECC fingerprinting on a video signal with a large user group, and present the proposed efficient detection algorithm along with theoretical analysis. Section IV shows the experimental results on video signals. We close with discussions in Section V and conclusions in Section VI.

II. JOINT-CODING–EMBEDDING ECC FINGERPRINTING FOR MULTIMEDIA

A. System Framework

A typical framework for ECC multimedia fingerprinting includes a code layer and a spread-spectrum-based embedding layer [7]. As the current work builds on such a framework, we start with a brief review of it. At the code layer, each codeword is assigned to one user as his or her fingerprint. For anticollusion purposes, the fingerprint code is constructed to have a large minimum distance so that the fingerprint codewords for different users are well separated [5]. To embed a codeword, we first partition the host signal into $L$ nonoverlapped segments, which can be one frame or a group of frames of video, with one segment corresponding to one symbol. For each segment, we generate $q$ mutually orthogonal spread-spectrum sequences with equal energy to represent the $q$ possible symbol values in the alphabet. In this paper, we use zero-mean i.i.d. Gaussian sequences to construct these spreading sequences. Each user’s fingerprint sequence is constructed by concatenating the spreading sequences corresponding to the symbols in his or her codeword. Before embedding the spreading sequence, we apply the permuted subsegment embedding (PSE) technique proposed in [7] to enhance collusion resistance. In permuted subsegment embedding, each segment of the fingerprint sequence is partitioned into $\beta$ subsegments and these subsegments are then randomly permuted according to a secret key. The permuted fingerprint sequence is added to the host signal through spread-spectrum embedding [12] with perceptual scaling to form the final fingerprinted signal.
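The construction steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy sizes, and the single global embedding strength (standing in for perceptual scaling) are all hypothetical; the independent Gaussian draws are only approximately orthogonal; and the subsegments are assumed to be permuted across the whole fingerprint sequence.

```python
import random


def make_segment_bases(num_segments, q, seg_len, rng):
    """For each segment, draw q zero-mean i.i.d. Gaussian spreading sequences.

    Independent Gaussian draws are only approximately orthogonal; a real
    system would orthogonalize them (assumption: skipped here for brevity).
    """
    return [[[rng.gauss(0.0, 1.0) for _ in range(seg_len)] for _ in range(q)]
            for _ in range(num_segments)]


def build_fingerprint(codeword, bases):
    """Concatenate the spreading sequence selected by each code symbol."""
    fingerprint = []
    for seg_idx, symbol in enumerate(codeword):
        fingerprint.extend(bases[seg_idx][symbol])
    return fingerprint


def subsegment_order(num_segments, beta, secret_key):
    """Key-dependent random order of all num_segments * beta subsegments."""
    order = list(range(num_segments * beta))
    random.Random(secret_key).shuffle(order)
    return order


def permute(fingerprint, order, sub_len):
    """Output subsegment i is input subsegment order[i]."""
    out = []
    for src in order:
        out.extend(fingerprint[src * sub_len:(src + 1) * sub_len])
    return out


def inverse_permute(permuted, order, sub_len):
    """Undo permute() using the same key-derived order (detector side)."""
    out = [0.0] * len(permuted)
    for dst, src in enumerate(order):
        out[src * sub_len:(src + 1) * sub_len] = \
            permuted[dst * sub_len:(dst + 1) * sub_len]
    return out


def embed(host, fingerprint, strength=1.0):
    """Additive embedding with one global strength (assumed simplification)."""
    return [h + strength * f for h, f in zip(host, fingerprint)]


rng = random.Random(0)
L, q, seg_len, beta = 4, 3, 8, 2          # toy sizes, not the paper's settings
bases = make_segment_bases(L, q, seg_len, rng)
fp = build_fingerprint([2, 0, 1, 1], bases)   # codeword of one hypothetical user
order = subsegment_order(L, beta, secret_key=1234)
marked = embed([0.0] * len(fp), permute(fp, order, seg_len // beta))
```

Because the permutation order is derived from the secret key alone, the detector can regenerate it and undo the permutation before demodulating the symbols.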

After receiving the fingerprinted copies, users may collaborate and mount collusion attacks. A widely considered collusion model in coded fingerprinting is interleaving collusion [5], where each colluder contributes a nonoverlapped set of segments (corresponding to symbols) and these segments are assembled to form a colluded copy. Another major type of collusion is performed in the signal domain; a typical example is the averaging collusion [2], where colluders average the corresponding components in their copies to generate a colluded version. The averaging collusion can be modelled as

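$$\mathbf{z} = \frac{1}{c} \sum_{j \in S_c} \mathbf{s}_j + \mathbf{x} + \mathbf{d} \qquad (1)$$

where $\mathbf{z}$ is the colluded signal, $\mathbf{x}$ is the host signal, $\mathbf{d}$ represents additional noise, $\mathbf{s}_j$ represents the fingerprint sequence for user $j$, $S_c$ is the colluder set, and $c$ is the number of colluders.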

For simplicity in analysis, we assume that the additional noise under both types of collusions follows an i.i.d. Gaussian distribution. Studies in [13] have shown that for Gaussian-distributed fingerprints with spread-spectrum embedding, a number of other collusions based on order statistics, such as minimum and min–max collusion attacks, can be well approximated by averaging collusion plus additive white Gaussian noise.
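The two collusion models can be simulated directly. The sketch below is illustrative only: the function names are hypothetical, each copy is assumed to be a plain list of samples of equal length, and interleaving is assumed to pick one donor uniformly at random per segment, which is just one of many possible interleaving patterns.

```python
import random


def averaging_collusion(copies, noise_std, rng):
    """Average the colluders' copies sample by sample, then add Gaussian noise."""
    c = len(copies)
    return [sum(copy[i] for copy in copies) / c + rng.gauss(0.0, noise_std)
            for i in range(len(copies[0]))]


def interleaving_collusion(copies, seg_len, rng):
    """Build the colluded copy segment by segment, each taken from one colluder."""
    colluded = []
    for start in range(0, len(copies[0]), seg_len):
        donor = rng.choice(copies)            # assumed: uniform random donor
        colluded.extend(donor[start:start + seg_len])
    return colluded


rng = random.Random(7)
copy_a = [1.0, 3.0, 5.0, 7.0]
copy_b = [3.0, 5.0, 7.0, 9.0]
averaged = averaging_collusion([copy_a, copy_b], noise_std=0.0, rng=rng)
interleaved = interleaving_collusion([copy_a, copy_b], seg_len=2, rng=rng)
```

With the noise standard deviation set to zero, the averaged copy is simply the mean of the two inputs, which makes the attenuation of each colluder's fingerprint by the number of colluders easy to see.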

At the detector side, our goal is to catch at least one colluder with high probability. We first extract the fingerprint sequence and inversely permute it according to the secret key used in the permuted subsegment embedding. Then for every segment of the test sequence, we correlate the signal with each possible sequence representing the alphabet and determine the symbol as the one with the highest correlation. The symbols detected from each segment will form the extracted codeword. We compare this codeword with the codebook and identify the colluder through minimum distance decoding. Alternatively, we can employ a soft detector to correlate the entire test signal directly with every user’s fingerprint signal $\mathbf{s}_j$. The user whose fingerprint has the highest correlation with the test signal is identified as the colluder, i.e., $\hat{j} = \arg\max_{j = 1, \ldots, N_u} T_N(j)$, where $N_u$ is the total number of users. Here, the detection statistic $T_N(j)$ is defined as

$$T_N(j) = \frac{(\mathbf{z} - \mathbf{x})^T \mathbf{s}_j}{\|\mathbf{s}\|}, \quad j = 1, 2, \ldots, N_u \qquad (2)$$

where $\mathbf{z}$ is the colluded signal, $\mathbf{x}$ is the original signal which is often available to detectors in fingerprinting applications, and $\|\mathbf{s}\| = \|\mathbf{s}_j\|$ for all $j$ owing to the equal-energy construction. This matched-filter detector takes advantage of the soft information from the embedding layer and provides better detection performance than the hard detection.
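A minimal sketch of the soft (matched-filter) detector described above follows. It assumes the detection statistic is the correlation of the host-subtracted test signal with each user's fingerprint, normalized by the fingerprint norm; the normalization and the function names are assumptions made for illustration.

```python
import math


def detection_statistic(z, x, s):
    """Correlate the host-subtracted test signal with one fingerprint."""
    residual_corr = sum((zi - xi) * si for zi, xi, si in zip(z, x, s))
    return residual_corr / math.sqrt(sum(si * si for si in s))


def identify_colluder(z, x, fingerprints):
    """Accuse the user whose fingerprint gives the largest statistic."""
    stats = [detection_statistic(z, x, s) for s in fingerprints]
    accused = max(range(len(stats)), key=lambda j: stats[j])
    return accused, stats


# Toy example: users 0 and 1 average their copies; user 2 is innocent.
fingerprints = [[1.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 1.0],
                [1.0, -1.0, 0.0, 0.0]]
host = [10.0, 20.0, 30.0, 40.0]
colluded = [h + 0.5 * (a + b)
            for h, a, b in zip(host, fingerprints[0], fingerprints[1])]
accused, stats = identify_colluder(colluded, host, fingerprints)
```

This exhaustive search evaluates one correlation per user, which is why its cost grows with the size of the user group.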

Owing to a relatively small alphabet size $q$ compared to the number of users $N_u$, as well as one symbol being put in one nonoverlapping media segment, ECC fingerprinting has the potential to generate and distribute fingerprinted media in an efficient way. For example, for each frame of a video, a total of $q$ copies carrying different symbol values can be generated beforehand; a fingerprinted copy for any user can then be quickly obtained by assembling appropriate copies of the frames together according to the fingerprint codeword assigned to him or her. The small alphabet size also keeps the computational complexity of fingerprint detection lower than that of the orthogonal fingerprinting approach. Our earlier analysis in [7] showed that ECC fingerprinting has a computational complexity of $O(qN + N_u L)$ compared with $O(N_u N)$ for orthogonal fingerprinting, where $N$ is the number of host signal samples. In many practical fingerprinting applications for small user groups, we generally have $N \gg N_u$ to ensure that fingerprints can be reliably embedded in multimedia data. This suggests that the first term is dominant and the overall computational complexity becomes $O(qN)$. With proper coding, however, the system can accommodate a large number of users. In these cases, the $O(N_u L)$ term becomes a nontrivial part of the overall computational complexity and, therefore, we need to consider both terms in the complexity expression.

B. Analysis of Collusion Resistance

We measure the collusion resistance of a fingerprinting system in terms of the probability of catching one of the true colluders. According to [7], under averaging collusion and additional additive white Gaussian noise, the vector of detection statistics $T_N(j)$’s defined in (2) follows an $N_u$-dimensional Gaussian distribution:

$$\mathbf{T}_N \sim \mathcal{N}\left( [\mathbf{m}_1^T, \mathbf{m}_2^T]^T, \; \sigma_d^2 \Sigma \right)$$

with

$$\mathbf{m}_1 = \|\mathbf{s}\| \left( \frac{1}{c} + \left( 1 - \frac{1}{c} \right) \rho \right) \mathbf{1}_c, \qquad \mathbf{m}_2 = \|\mathbf{s}\| \, \rho \, \mathbf{1}_{N_u - c} \qquad (3)$$

where $\mathbf{1}_k$ is an all-one vector with dimension $k$-by-1, $\Sigma$ is an $N_u$-by-$N_u$ covariance matrix whose diagonal elements are 1’s and off-diagonal elements are $\rho$’s, and $\sigma_d^2$ is the variance of the additive noise. We use $\mathbf{m}_1$ to denote the mean vector for colluders, and $\mathbf{m}_2$ as the mean vector for innocent users. Given the same colluder number $c$ and fingerprint strength $\|\mathbf{s}\|$, the mean correlation values of the test signal with colluders’ fingerprints and those with innocents are separated more widely for a smaller correlation $\rho$ between each pair of sequences. This suggests that in the absence of any prior knowledge on collusion patterns, a smaller $\rho$ leads to a higher colluder detection probability $P_d$.

Fig. 1. $P_d$ of ECC-based fingerprinting with different intra-fingerprint correlation $\rho$.

To facilitate the study, we derive the expression for the detection probability $P_d$ based on the model in (3) and the results on order statistics for multivariate Gaussian variables [14], shown in (4)

(4)

where $\Phi(\cdot)$ and $\phi(\cdot)$ denote the c.d.f. and p.d.f. of the standard Gaussian distribution $\mathcal{N}(0, 1)$, respectively. The detailed derivation is presented in Appendix A. We numerically examine $P_d$ under different $\rho$ values at a fixed system size and watermark-to-noise ratio (WNR), and show the results in Fig. 1. The results are consistent with the conjecture that a small $\rho$ value leads to higher detection accuracy and, thus, is preferred in the system design.
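The effect of the pairwise correlation on detection can also be explored by simulation rather than through the closed-form expression. The sketch below is a Monte Carlo estimate under an assumed equicorrelated Gaussian model: colluder statistics have mean ‖s‖(1/c + (1 − 1/c)ρ), innocent statistics have mean ‖s‖ρ, every statistic has variance σ_d², and every pair has covariance σ_d²ρ. Detection is counted as successful when the largest statistic belongs to a colluder. The function names, the toy parameter values, and the requirement 0 ≤ ρ ≤ 1 are assumptions of this sketch.

```python
import math
import random


def colluder_mean(s_norm, c, rho):
    """Assumed mean of the detection statistic for a colluder."""
    return s_norm * (1.0 / c + (1.0 - 1.0 / c) * rho)


def innocent_mean(s_norm, rho):
    """Assumed mean of the detection statistic for an innocent user."""
    return s_norm * rho


def estimate_pd(n_users, c, rho, s_norm, sigma_d, trials, seed=0):
    """Fraction of trials in which the top statistic belongs to a colluder."""
    rng = random.Random(seed)
    m1 = colluder_mean(s_norm, c, rho)
    m2 = innocent_mean(s_norm, rho)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    hits = 0
    for _ in range(trials):
        # A shared term plus a private term gives unit variance and
        # pairwise correlation rho for the noise part of every statistic.
        shared = rng.gauss(0.0, 1.0)
        best_colluder = max(
            m1 + sigma_d * (a * shared + b * rng.gauss(0.0, 1.0))
            for _ in range(c))
        best_innocent = max(
            m2 + sigma_d * (a * shared + b * rng.gauss(0.0, 1.0))
            for _ in range(n_users - c))
        hits += best_colluder > best_innocent
    return hits / trials


# Toy comparison of two correlation values (sizes far below the paper's scale).
pd_low_rho = estimate_pd(200, 10, 0.0, 30.0, 1.0, trials=300)
pd_high_rho = estimate_pd(200, 10, 0.5, 30.0, 1.0, trials=300)
```

The gap between the two means is ‖s‖(1 − ρ)/c, so increasing ρ shrinks the separation while the noise level stays the same, which is the mechanism behind the trend discussed above.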

For ECC fingerprinting, the pairwise correlation can be calculated by examining the code construction. Codes with a larger minimum distance have a smaller maximal pairwise correlation and, thus, are preferable. We use the aforementioned model of equal pairwise correlation, with $\rho$ set to this maximal pairwise correlation, to approximate the performance of coded fingerprinting. Consider an example of Reed–Solomon code-based fingerprinting with the alphabet size $q$, dimension $k$, and code length $L$. The total number of codewords is $N_u = q^k$ and the minimum distance is $D = L - k + 1$. The correlation $\rho$ between fingerprint sequences constructed from this code can be approximated as

$$\rho = \frac{L - D}{L} = \frac{k - 1}{L}. \qquad (5)$$

We can choose $k$ and $L$ for a small $\rho$ to obtain good collusion resistance.

Fig. 2. Analytical results of ECC-based fingerprinting for $N_u = 10^8$, $\rho = 0.03$, and various $N$ under WNR = $-10$ dB.
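The relationship between the Reed–Solomon parameters and the resulting correlation can be checked with a short calculation. The sketch below assumes the standard Reed–Solomon facts that a code of dimension k over an alphabet of size q has q^k codewords and minimum distance L − k + 1, and it approximates the correlation by the largest fraction of positions in which two codewords can agree. The function names and the example parameter values are hypothetical and are not the settings used in the paper.

```python
def rs_fingerprint_code(q, k, L):
    """Return (number of users, minimum distance, approximate correlation)."""
    num_users = q ** k
    min_distance = L - k + 1
    # Two distinct codewords agree in at most L - D = k - 1 positions.
    rho = (L - min_distance) / L
    return num_users, min_distance, rho


def min_dimension(q, num_users):
    """Smallest dimension k whose q**k codewords cover num_users."""
    k, capacity = 1, q
    while capacity < num_users:
        k, capacity = k + 1, capacity * q
    return k


users, d_min, rho = rs_fingerprint_code(q=16, k=2, L=15)
k_needed = min_dimension(q=32, num_users=10 ** 7)
```

For a fixed user capacity, a larger alphabet allows a smaller dimension, and a smaller dimension relative to the code length lowers the correlation; the price is a larger number of spreading sequences to correlate against in each segment.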

III. LARGE-SCALE VIDEO FINGERPRINTING

A video program contains a large amount of data and the potential users can be on the order of ten million. Thus, for many applications, video fingerprinting requires high collusion resistance as well as efficient generation, detection, and distribution.

With the developed model in Section II-B, we can numerically examine the system’s performance to determine whether the coded fingerprinting can meet the challenging requirements on user number and collusion resistance. In the examination, we set the total number of users $N_u = 10^8$ and correlation $\rho = 0.03$. We examine several possible lengths $N$ of embeddable host signal, which correspond to video signals of 30 s, 1 min, 10 min, and 30 min, respectively. The performance of the coded fingerprinting under this setting is shown in Fig. 2, where we consider different values of colluder number $c$ with a WNR at $-10$ dB. We can see that even for a host signal as short as 30 s, the system has the potential to resist more than 50 users’ collusion out of 100 million users. For a signal longer than 1 min, we are able to resist 100 users’ collusion attack. This promising result inspires us to explore the application of ECC fingerprinting to video signals with a large user group and address several important issues in this section.

A. Fingerprint Construction for Reaching Large Scale

The high data volume in a video stream provides seemingly abundant spaces for data embedding and offers high degrees of freedom in choosing how to fingerprint. How to apply ECC fingerprinting to video signals to achieve large user scale and high attack resistance is an important issue to investigate. One way is to construct the first round of ECC-based fingerprint sequences as basis sequences and employ another layer of fingerprint code to reach the user capacity as shown in Fig. 3(a). The fingerprint sequences of the inner layer serve as the alphabet for the outer code. For example, we build an ECC-based fingerprinting with a Reed–Solomon code to obtain 64 fingerprint sequences and apply permuted subsegment embedding on these sequences. Then, using these 64 sequences as basis sequences, we apply an outer Reed–Solomon code to reach the total number of users of the overall system. This scheme provides a more efficient fingerprint construction than the inner ECC fingerprint alone since it requires fewer spreading sequences for a given number of users due to one more code layer and enables an efficient distribution to users based on the outer code structure. However, based on the performance evaluation of ECC fingerprinting in [7], we can see that the outer level code structure in multilevel fingerprinting schemes limits the collusion resistance of the whole system, even though its inner level has high collusion resistance. Attackers may collude by segment-wise interleaving, i.e., one segment of the colluded signal (corresponding to one symbol in the outer code) comes from one of the colluders, which is equivalent to the attackers’ applying an interleaving attack on the code level. Thus, this kind of fingerprinting system cannot take advantage of the attack resistance at the embedding layer and has to rely on the code-level collusion resistance, which is usually very low (less than 10) due to a finite alphabet size of the code. To verify this, we build a two-layer system with an inner Reed–Solomon code, which is embedded into the host signal through permuted subsegment embedding, and an outer Reed–Solomon code to reach the large scale of the user group. The simulation results under interleaving collusion in Fig. 3(c) show that a dozen colluders are able to defeat the system even under a high WNR.

Fig. 3. Coding-embedding structures and performance comparison. (a) Building an outer code on ECC fingerprinting to reach user capacity. (b) Using fingerprint code to reach user capacity and then applying permuted subsegment embedding. (c) Collusion resistance under interleaving collusion for both schemes in (a) and (b).

Another way is to directly apply ECC fingerprinting, that is, to first build a fingerprint code to reach the user capacity and then apply the permuted subsegment embedding to embed the fingerprint into the video signal, as shown in Fig. 3(b). Continuing with the settings in the example just shown, this second approach would first construct a concatenated code using the same inner and outer codes, followed by applying permuted subsegment embedding. The results are also shown in Fig. 3(c), where we can see a much better collusion resistance compared with the previous method. The resistance against interleaving collusion is increased from 10 colluders to more than 50 colluders under a WNR of 0 dB and to 40 colluders under the other WNR setting examined in Fig. 3(c). The improvement is due to the step of random permutation before embedding. The random permutation prevents the attackers from knowing which subsegment corresponds to which symbol. As a result, the attackers cannot identify the code level and arbitrarily manipulate each symbol to mount the symbol-wise interleaving attack on either an inner code or outer code. Leveraging the embedding layer, the fingerprinting system is able to resist more colluders than the code level alone. Therefore, designing the fingerprint code to reach user capacity first and then applying permuted subsegment embedding to instill randomness to enhance collusion resistance is a preferred way to construct fingerprint signals for a large number of users.

We now examine how to choose code parameters to achieve low correlations among fingerprint sequences for high collusion resistance. The theoretical model in Section II-B suggests that a small code dimension $k$ and a large code length $L$ generally provide good collusion resistance. However, these parameters cannot be chosen arbitrarily. There are several constraints on them depending on the code constructions. Specifically, for the Reed–Solomon code construction, we have the following constraints:

System requirement on total user (6)

RS code construction constraint (7)

FP orthogonality in each segment (8)

where is the host signal length, is the total number of users, is the alphabet size, and is the code length. Setting at the maximum value of , we derive from (8)

(9)

This means the upper bound of the value is on the order of . Usually in video fingerprinting, the host signal has length , and for the Reed–Solomon code. This suggests that (6) is a more stringent requirement on . Thus, we can choose parameters under the constraint of (6) to achieve the desired tradeoff between the collusion resistance and the computational complexity in detection, which is according to our previous study [7]. For example, in applications where the computational resources at the detector are very limited, we can first choose a small to achieve a low computational complexity for detection and then adjust value to reach a large user capacity; on the other hand, if the detection complexity is not a major concern, the system designer can fix to be 2 and to minimize the correlation among fingerprint sequences in order to achieve high collusion resistance. We can see that these parameters provide a tradeoff between collusion resistance and efficiency, and they should be chosen according to the priority of various design requirements.
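The parameter selection can be sketched as a small feasibility check. Both the symbol names (`N` for host length, `Nu` for the number of users, `q` for alphabet size, `k` for code dimension, `L` for code length) and the inequality forms are assumptions made for illustration, based only on the labels of (6)–(8): user capacity `q**k >= Nu`, Reed–Solomon length `L <= q + 1`, and room for `q` orthogonal sequences in a segment of `N / L` samples.

```python
def feasible(N, Nu, q, k, L):
    """Return True if (q, k, L) meets the three assumed constraints."""
    enough_users = q ** k >= Nu      # assumed form of (6): capacity covers all users
    rs_length_ok = L <= q + 1        # assumed form of (7): RS code length bound
    orthogonal_ok = q * L <= N       # assumed form of (8): q orthogonal sequences per segment
    return enough_users and rs_length_ok and orthogonal_ok

def smallest_alphabet(N, Nu, k):
    """Smallest q (with L = q - 1) whose dimension-k code covers Nu users, or None.

    The search ignores the requirement that q be a prime power, so the
    result should be rounded up to the next valid field size if needed.
    """
    q = 2
    while q ** k < Nu:
        q += 1
    L = q - 1
    return (q, L) if feasible(N, Nu, q, k, L) else None
```

For a hypothetical host length of one million samples and sixteen million users with `k = 4`, the search returns `q = 64` and `L = 63`; a larger `k` would allow a smaller alphabet at the price of higher correlation among codewords.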

B. Efficient Detection Through Trimming

Recall that the computational complexity of the detection is the sum of two terms: for demodulation and for colluder identification. When the number of users scales up to millions, the second term becomes a nontrivial part of the total computational complexity. In this subsection, we exploit the code structure to significantly reduce the computational complexity.

1) Trimming-Based Detection: We propose employing a trimming process based on the detection results on predefined symbol positions, which we refer to as trimming positions. We first calculate the correlation statistics for the set of segments corresponding to trimming positions with every possible spreading sequence. That is

where and are the th segment of the test signal and original signal, respectively, and is the spreading sequence for symbol , which we construct as an i.i.d. Gaussian sequence. Then, for each trimming position , we pick the symbols that have a higher statistic than a threshold as candidate symbols

(10)

The codewords that match candidate symbols in for all of the positions in are put into a suspicious codeword set

Finally, we apply a matched filter detection of (2) within the suspicious set to identify the colluder.
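The three steps above (per-position correlation, thresholding into candidate symbols, and matched filtering inside the suspicious set) can be sketched as follows. This is an illustrative outline, not the authors' implementation: the normalization of the statistic by the sequence norm, the dictionary-based codebook, and all names are assumptions, and the final correlation step merely stands in for the matched filter detection of (2).

```python
import math

def correlate(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def trimming_detect(test, host, spreading, codebook, trim_positions, threshold):
    """test/host: one segment (list of floats) per symbol position.
    spreading[s]: sequence for symbol s.  codebook: user -> tuple of symbols."""
    # The host is assumed to be available at the detector, so it is removed first.
    residual = [[t - h for t, h in zip(ts, hs)] for ts, hs in zip(test, host)]

    # Candidate symbols: normalized correlation above the threshold.
    candidates = {}
    for p in trim_positions:
        candidates[p] = {
            s for s, u in enumerate(spreading)
            if correlate(residual[p], u) / math.sqrt(correlate(u, u)) > threshold
        }

    # Suspicious set: codewords matching a candidate at every trimming position.
    suspicious = [user for user, cw in codebook.items()
                  if all(cw[p] in candidates[p] for p in trim_positions)]
    if not suspicious:
        return None, suspicious

    # Correlation detection over all positions, restricted to the suspicious set.
    def statistic(user):
        cw = codebook[user]
        return sum(correlate(residual[p], spreading[cw[p]]) for p in range(len(cw)))

    return max(suspicious, key=statistic), suspicious
```

Only the codewords that survive trimming are correlated against the full test signal, which is where the saving in colluder identification comes from.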

The computational complexity of this scheme is determined by the number of trimming positions. If symbol positions are used for trimming, the resulting computational complexity for colluder identification can be reduced from to , and the reduction is fold. For example, in a system holding around 1000 users with and , when we use all of the information symbols for trimming, we can obtain three orders of magnitude reduction on computational complexity. The detection computational complexity can be more significantly reduced for a large-scale system with larger and .

Fig. 4. Correlation versus expansion factor.
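The size of the reduction can be checked with a short calculation. The sketch below assumes a systematic `q`-ary code of dimension `k` in which a fixed number of candidate symbols survives at each of `t` information positions used for trimming; the example values `q = 32`, `k = 2` are an assumption chosen because they give roughly 1000 users, matching the example in the text.

```python
def remaining_codewords(q, k, t, candidates_per_position=1):
    """Codewords left after trimming on t of the k information positions."""
    assert 0 <= t <= k
    return (candidates_per_position ** t) * q ** (k - t)

def reduction_factor(q, k, t, candidates_per_position=1):
    """Ratio of the full codebook size to the suspicious-set size."""
    return q ** k / remaining_codewords(q, k, t, candidates_per_position)
```

With `q = 32`, `k = 2`, and both information symbols used, the factor is 1024, i.e., about three orders of magnitude; keeping more than one candidate per position shrinks the gain accordingly.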

2) Choosing Trimming Positions: In order to find out the corresponding codeword given the symbol values at certain positions (i.e., to find out given ), we generally need to solve the equation of that contains equations and unknowns. Here, is the parity check matrix of the code, is the codeword vector with known symbols at position and unknowns at remaining positions, and . The number of solutions for these equations is corresponding to the suspicious codewords after the trimming. The computational complexity of solving such an equation array is as the two terms here correspond to the Gaussian elimination process and the enumeration of all codewords satisfying the equation array, respectively.

To further reduce the complexity, we employ systematic construction for the Reed–Solomon code to build fingerprint code and use the information symbols for trimming. In the systematic code construction, the first symbols in the codeword are information symbols, and the remaining are parity check symbols. These information symbols provide the indices of users and can be used to easily identify the users and their codewords. The position information of these symbols (or the corresponding segments) is protected from adversaries by the random permutation during the permuted subsegment embedding. In order to achieve higher detection accuracy, it is desirable to assign more energy on the trimming symbols. We accomplish this by expanding the length of fingerprint sequence by times for each information symbol, so that the segment length becomes , where is the total sequence length and is the codeword length. The segment size for remaining symbols would be . The expansion can be implemented by repeatedly embedding times the sequence corresponding to the information symbols.

As expansion of the sequences for information symbols increases the correlation among fingerprints, we now examine its impact on the collusion resistance. Recall that a Reed–Solomon code with parameters has the minimum distance of (i.e., any two codewords share at most symbols). For some pairs of codewords, the symbols in common are all at the information symbol positions, so after the expansion, the number of shared symbols becomes . For codewords that have fewer than information symbols in common, the number of shared symbols after the expansion would be smaller than . Therefore, in the expanded code whose code length is , the minimum distance becomes , and the maximal correlation of the fingerprint sequences would become

Fig. 5. P versus expansion factor by trimming detector at a WNR of (a) −5 dB and (b) −10 dB.

Fig. 6. Performance comparison of trimming detection versus matched filter detection with expansion factor = 3 at WNR of (a) −10 dB and (b) −5 dB.

(11)

Fig. 4 shows the relationship between the correlation and the expansion factor .
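The effect of the expansion on code length and worst-case correlation can be worked through numerically. The sketch below follows the reasoning in the text (each of the `k` information symbols is repeated `expansion` times, and in the worst case all `k - 1` shared symbols lie in information positions); the variable names are assumptions, and the sample values `L = 63`, `k = 4`, `expansion = 3` are chosen because they reproduce the equivalent codeword length of 71 reported in Section IV.

```python
from fractions import Fraction

def expanded_code(L, k, expansion):
    """Return (length, max shared symbols, min distance, max correlation)
    of an (L, k) Reed-Solomon code after repeating each information symbol."""
    length = L + (expansion - 1) * k      # k symbols each gain (expansion - 1) extra slots
    shared = expansion * (k - 1)          # worst case: all k - 1 shared symbols are expanded
    return length, shared, length - shared, Fraction(shared, length)
```

For the sample values this gives a length of 71, at most 9 shared symbol slots, and a worst-case correlation of 9/71 (about 0.13), compared with 3/63 without expansion.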

We can see that the correlation increases as becomes larger; based on the results in (3), the overall detection accuracy may decrease. On the other hand, high enhances the accuracy of trimming symbol detection and, thus, may lead to high overall probability of detection . We examine the effect of the expansion factor on and show the results in Fig. 5 at a WNR of 0 dB and . In this figure, we set the host signal length as and the total number of users . The code is constructed by Reed–Solomon code with , , and . Both information symbols are used for trimming. From the results, we find that there is an optimal value of for a given to achieve the highest detection probability. For the experimental settings in Fig. 5, achieves the best detection results. In the next subsection, we will derive the relationship between and various system parameters, and optimal can be obtained based on the theoretical model. In Fig. 6, we examine of trimming detection and matched-filter detection with under WNR and dB. We can see that compared with matched-filter detection, the accuracy of trimming detection is reduced by less than 0.5% for moderate WNRs and less than 6% for low WNRs. In most fingerprinting applications, the host signal is usually available to the detector and its interference can be removed from the test signal, so the common levels of WNR are generally higher than dB. As such, we expect a comparable performance by trimming detection to matched-filter detection, and the trimming detection reduces the decoding computational complexity by . As can be seen from the results, the efficiency of trimming detection comes at a negligible loss in detection performance relative to matched-filter detection.

3) Performance Analysis: We analyze the performance of the proposed trimming detection in terms of the detection probability. For simplicity, we show the analysis for the case that only one information symbol position is used for trimming and the symbol with the highest correlation statistic is chosen to trim the codebook. Thus, after trimming, codewords remain in the codebook. We assume random collusion in our analysis (i.e., all of the users have equal probability to participate in the collusion).

Without loss of generality, we select the first symbol position as the trimming position and its corresponding frame size is . We call the symbol in the alphabet as color and, thus, there are colors for a -ary codebook. Note that at any given position of a linear code, such as the Reed–Solomon code considered in this paper, each color occurs the same number of times among all of the codewords. That is, each color occurs in codewords among a total of codewords. When considering a random collusion attack, the effect of forming a -colluder group on the first symbol position is to choose symbols from symbols. The total number of possibilities for such a -colluder group is . Denote the colluders’ color pattern at the trimming position as a vector and , where is the total number of colors and is the number of symbols with color . For example, a vector means that among the first symbol position of five colluders’ codewords, there are three different colors in total: one color appears once, one appears twice, and another also appears twice. The exact color values do not affect the analysis here due to the symmetry of the code, the randomness of the collusion, and the mutual independence of the fingerprint sequences that represent different colors. For example, for the color pattern [1 2 2] and color alphabet , the instance of and would result in the same detection probability on the colors. Given this vector , we can derive the probability of detection as

catch one colluder inside subcode

subcode is picked

Here, subcode refers to the set of codewords having color at the trimming position; is calculated according to (4) with replaced by and replaced by ; and (11) is used for calculating . The probability of subcode being picked at the trimming position is calculated by

(12)

Fig. 7. Results of trimming detection at WNR = −10 dB and −5 dB.

where is the detection statistic for symbol and is the p.d.f. of . Note that due to the presence of white Gaussian noise and the orthogonality of the sequences representing different symbols, ’s are independent Gaussian variables with equal variance. For those colors that are not contained in the colluders’ codewords, ’s are set to 0.
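Since the statistics are independent Gaussians with equal variance, the probability that a given color yields the largest statistic can be evaluated numerically. The sketch below uses the standard identity for the maximum of independent variables (integrate the density of one statistic times the distribution functions of the others); this may or may not coincide term by term with (12), and the means passed in are hypothetical inputs rather than values from the paper.

```python
import math

def pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pick_probabilities(means, sigma=1.0, steps=4000, span=8.0):
    """P(statistic i is the largest) for independent N(means[i], sigma^2) statistics."""
    lo = min(means) - span * sigma
    hi = max(means) + span * sigma
    dt = (hi - lo) / steps
    probs = []
    for i, mi in enumerate(means):
        total = 0.0
        for n in range(steps):
            t = lo + (n + 0.5) * dt                    # midpoint rule
            term = pdf((t - mi) / sigma) / sigma       # density of statistic i at t
            for j, mj in enumerate(means):
                if j != i:
                    term *= cdf((t - mj) / sigma)      # all other statistics below t
            total += term * dt
        probs.append(total)
    return probs
```

Colors absent from the colluders' codewords would enter with mean 0, as stated above; colors carried by more colluders would enter with larger means and are therefore more likely to be picked.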

After obtaining , we can calculate the overall probability of detection as

(13)

Here, is the set of all color patterns for colluders, and it can be recursively expressed in a matrix form

where denotes the set of row vectors in excluding the rows containing the element in . represents an all “1” column vector with the same number of entries as the row number of , and represents an all “2” column vector with the same number of entries as the row number of , and so on. In the above matrix, each row vector contains elements and represents one possible color pattern generated from colluders. For each color pattern in , it can be shown that

(14)

with

(15)

Here, is an auxiliary vector representing the histogram of a color pattern , is the number of distinct values in vector and is the number of occurrences of the th value in vector . The detailed derivations for and are presented in Appendix B.


Fig. 8. Experimental results. (a) Original frame. (b) Fingerprinted frame before attack with PSNR = 32 dB. (c) 3-Mb/s MPEG compressed frame.

According to the previous analysis, we are able to derive the matrix for any value and obtain the value for each vector (each row of ). By plugging these quantities into (13), we obtain the analytical expression of the probability of detection for the trimming method. The optimal parameter can be chosen to maximize so that

The analysis for a more general trimming detection can be conducted in a similar way. For example, we can set a threshold for the trimming detection statistic to select multiple symbols as shown in (10). The only change that needs to be made is in the calculation of , which now becomes

(16)

where is a threshold, and

with . Multiple-position trimming can be analyzed by iteratively applying the process just shown.

We numerically examine the analysis result of (13) with the parameter settings of , , and . The trimming detection is performed only on the first symbol position and the color with the highest detection statistic is picked for trimming. The results for two WNR values (−10 dB and −5 dB) are shown in Fig. 7, along with the simulation results at the same parameter settings. We can see that the numerical results match the simulation results well for most of the values. The small gap at some points comes from the simplified analysis and numerical computation on the combination terms in and the calculation of , where a simplification of equal correlation is assumed in the analysis while the actual fingerprints have different correlation values according to the codebook construction.

IV. EXPERIMENTAL RESULTS

A. Experimental Settings and Results

In this section, we apply the proposed fingerprinting scheme on video signals and examine the experimental performance. The test video signal is obtained from [18] and has a video graphics array (VGA) size of 640 × 480. The total number of users that we target is on the order of ten million and the colluder number is around 100. We choose a Reed–Solomon code with , , and , which leads to the number of users . The expansion parameter in the efficient detection is set at 3, and the first four symbol positions are selected for trimming. Thus, the equivalent codeword length is 71 and the resulting pairwise maximum correlation is according to (11).

During the fingerprint embedding, each frame is transformed into the discrete cosine transform (DCT) domain. Fingerprint sequences are embedded into these DCT coefficients through additive embedding with perceptual scaling. The host video signal has 852 frames to carry one codeword symbol for about every 12 frames. For simplicity, we repeatedly embed the same fingerprint sequence into every group of frames that consists of six consecutive frames. The issue of intravideo collusion attack will be discussed in Section V-B. Subsegment partition factor is set as 24. Fig. 8(a)–(c) shows the 500th frame in the original, fingerprinted and compressed video sequences. Fig. 9 shows the simulation results on the probability of catching one colluder versus colluder number , under averaging and interleaving collusion attacks followed by MPEG-2 compression. The curves are obtained by averaging the results of 50 iterations.
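As a quick consistency check on these settings, the frame budget per symbol slot follows from the frame count and the equivalent codeword length of 71 given earlier in this section. The frame rate of 30 frames/s used for the duration is an assumption, not a figure stated in the text.

```latex
\frac{852\ \text{frames}}{71\ \text{symbol slots}} = 12\ \text{frames per slot}
  = 2 \times (6\text{-frame group}),
\qquad
\frac{852\ \text{frames}}{30\ \text{frames/s}} \approx 28.4\ \text{s}.
```

Under this reading, each expanded information symbol occupies three such slots, and the whole codeword fits in under half a minute of video at the assumed frame rate.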

The results shown in Fig. 9 are encouraging in that with less than 30 s of video, we are able to hold more than 10 million users and resist more than 100 users’ averaging collusion and 60 users’ interleaving collusion. The resistance can be further improved by increasing the video sequence length and employing a larger value, at the cost of reduced efficiency in distributing fingerprinted signals. In comparison, without the joint coding and embedding approach, the system can only resist about two users’ collusion as indicated by the dashed line in Fig. 9(a). We can see that the joint coding and embedding strategy can help overcome the code-level limitation and substantially improve the collusion resistance at affordable computational complexity. We further reduce the MPEG-2 compression bit rate down to 1.5 Mb/s. Even under this stronger compression, the collusion resistance drops only moderately from 60 to 50 colluders under interleaving collusion; the resistance under averaging collusion is still higher than 100 colluders.

Fig. 9. Probability of catching one colluder P versus colluder number c under averaging and interleaving collusion followed by (a) 3-Mb/s, (b) 1.5-Mb/s MPEG-2 compression, and (c) under min–max attack followed by 3-Mb/s MPEG-2 compression.

B. Results Under Nonlinear Collusion Attack

The experimental and analytical results that we have presented so far are based on averaging collusion and interleaving collusion. Our next experiments examine the collusion resistance under min–max attack which can be used as a representative of nonlinear collusion.¹ In the min–max attack, colluders choose the average of the minimum and maximum values of their copies in each DCT coefficient position to generate the colluded version. MPEG-2 compression is further applied to the colluded signal. Fig. 9(c) shows the detection probability versus the colluder number under the min–max attack followed by 3-Mb/s MPEG-2 compression. The results show that under this nonlinear collusion attack, the collusion resistance is around 80 colluders, which further demonstrates the effectiveness of the proposed large-scale fingerprinting. Its performance gap compared with that under averaging collusion is mainly because the min–max collusion introduces higher distortion on the colluded signal than the averaging collusion [13]. For 80 users’ collusion, the mean squared error (MSE) introduced to the host signal by an averaging attack is 0.94 and is 3.57 by the min–max attack before MPEG-2 compression.
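The two collusion operations compared here can be written out directly. The sketch below treats each copy as a flat list of coefficient values; the function names are illustrative, and the subsequent MPEG-2 compression step is not modeled.

```python
def average_collusion(copies):
    """Averaging attack: per-coefficient mean over all colluders' copies."""
    n = len(copies)
    return [sum(vals) / n for vals in zip(*copies)]

def minmax_collusion(copies):
    """Min-max attack: per-coefficient average of the minimum and maximum values."""
    return [(min(vals) + max(vals)) / 2 for vals in zip(*copies)]
```

The min–max version needs the extreme values at every coefficient position, which is the order-statistics cost discussed in the next paragraph.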

It is worth mentioning that the computational complexity of order statistics-based nonlinear collusion significantly increases because of the sorting involved. In the examined experimental settings, the nonlinear collusion attack from 80 colluders requires 12 h while the averaging collusion only needs 70 min. This high computational complexity can also help deter the colluders from employing the nonlinear collusion on video especially for a large colluder group.

¹ Interested readers may refer to [13] for detailed analysis on the relationship and comparison among various nonlinear collusion attacks.

TABLE II
TIME CONSUMPTION OF THE PROPOSED EFFICIENT FINGERPRINTING EMBEDDING AND DETECTION WITH USER NUMBER N = 16 MILLION AND 852 VGA FRAMES

C. Performance Summary

We summarize the time consumption of the embedding and detection in Table II. The efficient scheme for embedding refers to the way that we generate all possible versions beforehand for each subsegment and concatenate these subsegments according to each user’s fingerprint code. Fully embedding means that we perform the embedding on the entire host signal for every user without exploiting the code structure. The efficient scheme for detection refers to the trimming detection proposed in Section III-B. Standard compilation from Microsoft Visual Studio has been applied to the C++ implementations of both schemes. We can see from Table II that the code structure of the fingerprinting speeds up the fingerprinted signal generation by 6–7 times, and the proposed trimming detection consumes less than half of the time required by the matched-filter detection.
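The difference between the efficient and the full embedding can be sketched as follows. The sketch works at the segment level with plain additive embedding; perceptual scaling, subsegment partitioning, and the random permutation are deliberately omitted, and all names are illustrative assumptions.

```python
def pregenerate(host_segments, spreading):
    """versions[p][s]: host segment p marked with the sequence of symbol s.
    Done once, independently of the number of users."""
    return [[[h + u for h, u in zip(seg, seq)] for seq in spreading]
            for seg in host_segments]

def assemble(versions, codeword):
    """Per-user step: concatenate the pregenerated versions selected by the codeword."""
    out = []
    for p, s in enumerate(codeword):
        out.extend(versions[p][s])
    return out
```

The embedding cost is paid once per (position, symbol) pair, so serving each additional user reduces to a lookup and concatenation.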

V. DISCUSSIONS

A. Relation to Group-Based ECC Fingerprinting

The proposed trimming detection utilizes the code structure to first detect trimming symbols and then uses these symbols to trim the codebook. This process is similar to the detection of GRACE fingerprinting proposed in an earlier work [7]. In this group-based fingerprinting, the group symbols are first extracted and then only codewords inside the extracted groups are put under suspicion for further examination. However, the GRACE fingerprinting cannot be applied here to provide efficient detection for the following reasons. We recall that the efficiency of the joint coding and embedding fingerprinting lies in the code structure when many copies share the same segment. By pregenerating those segments, we can assemble the fingerprinted signal for each user according to his or her codeword to meet the real-time requirement. However, in GRACE fingerprinting, the group information is spread over the entire signal and is different for users from different groups. After adding the group information, users have few segments in common; thus, we cannot utilize the efficient construction as before. While it is possible to implement group fingerprinting by appending the group information to maintain the efficiency in fingerprint construction, this scheme is vulnerable to the framing attack discussed in [7] because colluders can identify the shared subsegments, make a good guess on the group information, and circumvent the detection by some contributing only user information and others contributing only group information.

Fig. 10. Illustration of hash-based fingerprint construction.

In contrast, the proposed trimming detection takes advantage of the inherent code structure and randomization in embedding, which cannot be easily identified by the colluders but can be exploited by the detector to perform the efficient detection.

B. Intravideo Collusion

The large amount of data in video is a double-edged sword as it also benefits the attackers. Given one copy of the fingerprinted video, an attacker may apply multiple-frame collusion [15], where several frames are used to estimate and eventually remove the fingerprint. One possible implementation of such intravideo attacks is that the attacker may average several frames that have “visually similar” content but are embedded with independent fingerprint sequences. By collecting enough frames, this averaging operation can successfully remove the embedded fingerprint at a possible expense of reduced visual quality. Furthermore, this attack can be extended to object-based collusion, where similar objects are identified and averaged or swapped to circumvent the detection. Another possible attack is that the attacker may identify several “visually dissimilar” frames or regions embedded with the same fingerprint sequences and average these frames to estimate the embedded fingerprint. This estimated fingerprint sequence will then be subtracted from the fingerprinted signal to obtain an approximation of the original frame. In each of the above cases, the attacker can succeed by attacking just one fingerprinted copy without help from other colluders. Therefore, the design and embedding of the fingerprint sequence should be robust to these intravideo attacks.

A basic principle to resist these attacks is to embed fingerprint sequences based on the content of the video (i.e., similar fingerprint sequences are embedded in frames/objects with similar content and different fingerprint sequences are embedded in frames with different content [16]). For example, a scheme proposed by Fridrich employs the content-based hash of each segment of the signal as a key to generate watermark sequences [17]. However, this method cannot be directly used in the ECC-based fingerprinting for two reasons. First, the fingerprint spreading sequences in ECC fingerprinting are mainly determined by the code structure and the designer does not have the freedom to generate the sequences according to the content. In ECC-based fingerprinting, the spreading sequences for different symbols should be orthogonal to each other. On the other hand, the content-based construction of fingerprint sequences would lead to correlated spreading sequences for different symbols, which conflicts with the fingerprint construction for ECC fingerprinting. Second, the correlation between the watermark sequences for two frames generated in [17] decreases exponentially as the Hamming distance between their hash values increases. This change is so dramatic that it may result in visually similar frames having quite different watermarks. Therefore, to address the intravideo collusion attack, we need to find a mechanism that is able to convert the sequence structure imposed by the codeword to what the video content demands, and build content-based fingerprints such that the difference between fingerprints of two frames is linear with respect to the difference of frame content.

Since consecutive frames in one scene are visually similar, we can repeatedly embed the same fingerprint sequence into those consecutive frames. For those visually dissimilar frames that are assigned similar fingerprint sequences based on ECC construction, we need to modify the fingerprint sequences to be dissimilar. To achieve this goal, we can use the hash of each frame to adjust the fingerprint sequences. For example, frame and frame have hash and , respectively, each of which is a -bit binary sequence. If frame and frame are visually similar, the Hamming distance between and , , would be very small (i.e., only a few bit positions are different); if they are visually different, will be very large. As illustrated in Fig. 10, we choose a binary secret key with the same length as the frame hash to generate a fingerprint sequence. For each frame , we generate new keys by replacing ’s th bit with ’s th bit, . Meanwhile, the fingerprint sequence for the current frame is divided into blocks. Each of these new keys is used to permute one block. The concatenation of these permuted blocks will be the final fingerprint sequence for frame . With this method, visually similar frames will have many permuted blocks in common; thus, the overall sequence will be similar; and visually dissimilar frames will have dissimilar sequences. The correlation between the sequences changes linearly with the Hamming distance between two frames’ hashes.
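The hash-based construction can be sketched in a few lines. The sketch assumes bit lists for the secret key and the frame hash, equal-size blocks, and Python's `random.Random` seeded with the modified key as the keyed permutation; these choices, along with the function name, are illustrative assumptions rather than details given in the paper.

```python
import random

def frame_fingerprint(base, key_bits, hash_bits):
    """Permute each block of `base` with a key whose i-th bit is taken from the frame hash."""
    b = len(key_bits)
    assert len(hash_bits) == b and len(base) % b == 0
    size = len(base) // b
    out = []
    for i in range(b):
        new_key = list(key_bits)
        new_key[i] = hash_bits[i]                         # i-th key bit replaced by i-th hash bit
        seed = int("".join(str(bit) for bit in new_key), 2)
        block = list(base[i * size:(i + 1) * size])
        random.Random(seed).shuffle(block)                # keyed permutation of block i
        out.extend(block)
    return out
```

Two frames whose hashes agree in bit `i` obtain the same key for block `i` and hence the same permuted block, so the number of shared blocks is at least the hash length minus the Hamming distance, which is the linear behavior described above.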

VI. CONCLUSION

In this paper, we consider fingerprinting video signal under challenging settings of accommodating millions of users and resisting hundreds of users’ collusion. Our recent work of joint coding and embedding ECC fingerprinting has shown an excellent tradeoff between collusion resistance and efficient construction and detection. Building upon this improved ECC fingerprinting, we address issues associated with large-scale video fingerprinting, including designing code structure and speeding up detection. We have found that building fingerprint code first to reach the large scale of the system and then applying the joint coding and embedding scheme is preferred for a good tradeoff between efficient generation and collusion resistance rather than building an outer layer code on the joint coding and embedding scheme. The proposed trimming approach can speed up the decoding by three orders of magnitude with a drop of less than 0.5% in detection accuracy. With the proposed fingerprint construction and efficient detection, the system holding 16 million users can resist 50–60 colluders’ interleaving collusion and more than 100 users’ averaging collusion as well as 80 users’ nonlinear collusion. The user capacity and collusion resistance can be further increased by adjusting such system parameters as code dimension and number of subsegments . Both analysis and the experimental results show a strong potential of joint coding and embedding ECC fingerprinting for large-scale video fingerprinting applications.

APPENDIX A
DERIVATION OF DETECTION PROBABILITY OF (4)

Based on the distribution of the detection statistic in (3), we derive the probability of detection as follows. We denote the maximum detection statistic for colluder and innocent user as and , respectively. That is

The probability of catching one colluder using the maximum detector of (2) can be expressed as

(17)

The approximation in the equation just shown comes from the simplification on the independence between and due to small correlation . To obtain an expression of , we employ the results on order statistics from [14]: let have an -dimension Gaussian distribution such that the mean value , variance of , and the correlation coefficient of and satisfy

Then, the probability density function (pdf) and cumulative distribution function (cdf) of the th smallest variable , denoted as and , are

for

where and . Here, and denote cdf and pdf of standard Gaussian distribution , respectively. Based on this result, we can obtain the cdf for and pdf for by plugging in the proper value with for colluder and for the innocent user. The expression for the probability of detection becomes what is shown in (4).

APPENDIX B
DERIVATION OF AND

From (13), we can see that to obtain the overall detection probability , we need to obtain two quantities: the set of possible patterns for colluders and the probability for each pattern to occur . We assume that the collusion occurs randomly among a total of users, and the total number of possible choices for colluders is . For a given color pattern , the number of possibilities is , where is the number of instances of color pattern and the term is the number of choices of colluders’ codeword that has occurrences for each color. Then, the can be obtained by

To calculate , let us look at an example with and and use to denote the four colors in the alphabet. The possible instances of a color pattern are

giving a total of instances. For any value of , , and , the value of can be obtained as follows. For a given color pattern led by nonzero elements, we derive a histogram vector , where is the number of distinct values in vector and is the number of occurrences of the th value in vector . For the example just shown , we have , indicating that there are two “1’s” and one “2” in . Apparently, . Then, can be calculated by

(18)
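The counting argument in this appendix can be checked with exact arithmetic. The sketch below assumes the number of instances of a pattern is the number of ways to assign its nonzero counts to distinct colors, with equal counts treated as interchangeable (the role of the histogram vector), and that each color labels `Nu / q` codewords at the trimming position; the function names and the small test values are illustrative.

```python
from collections import Counter
from fractions import Fraction
from math import comb, factorial

def pattern_instances(pattern, q):
    """Number of ways to assign the nonzero counts in `pattern` to q colors."""
    m = len(pattern)                              # number of colors actually present
    ways = factorial(q) // factorial(q - m)       # ordered choice of m distinct colors
    for h in Counter(pattern).values():           # histogram of repeated counts
        ways //= factorial(h)                     # equal counts are interchangeable
    return ways

def pattern_probability(pattern, q, Nu):
    """Probability of a color pattern when sum(pattern) colluders are drawn at random."""
    per_color = Nu // q                           # codewords carrying each color
    c = sum(pattern)
    ways = pattern_instances(pattern, q)
    for x in pattern:
        ways *= comb(per_color, x)                # pick x colluders within one color
    return Fraction(ways, comb(Nu, c))
```

For four colors and the pattern with two “1’s” and one “2”, `pattern_instances` returns 12, and summing `pattern_probability` over every pattern of a given colluder number yields exactly 1, which is a useful sanity check on the enumeration.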


The next step is to derive the color patterns for a given , denoted as . In the following, we use a matrix to represent in which there is a color pattern in each row. We observe that , , . For any , it can be shown that

For any

where denotes the set of vectors in excluding the rows containing an element in . represents an all “1” column vector with the same number of entries as the row number of , and represents an all “2” column vector with the same number of entries as the row number of , and so on. If for some is empty, then the submatrix will not appear in . In the equations just shown, each row vector contains elements. For simplicity in representation, we omit the zeros at the end of each row. A full vector can be obtained by simply appending zeros at the end of each row to reach elements.
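The set of color patterns can also be generated programmatically. The sketch below is a plain recursive enumeration of nondecreasing count vectors (trailing zeros omitted, as in the text) with at most `q` nonzero entries; it is intended to produce the same set of rows as the matrix recursion above, but it is an equivalent enumeration written for illustration, not a transcription of that recursion.

```python
def color_patterns(c, q, smallest=1):
    """All nondecreasing lists of positive integers summing to c with at most q entries."""
    if c == 0:
        return [[]]                     # nothing left to distribute
    if q == 0:
        return []                       # colluders remain but no colors are left
    patterns = []
    for first in range(smallest, c + 1):
        # Fix the leading count, then distribute the rest over the remaining colors
        # using counts no smaller than the leading one.
        for rest in color_patterns(c - first, q - 1, first):
            patterns.append([first] + rest)
    return patterns
```

For four colluders and four colors this yields the five patterns [1 1 1 1], [1 1 2], [1 3], [2 2], and [4]; restricting the alphabet to two colors leaves only [1 3], [2 2], and [4].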

REFERENCES

[1] F. Ergun, J. Kilian, and R. Kumar, “A note on the limits of collusion-resistant watermarks,” in Proc. Eurocrypt , 1999, pp. 104–149.

[2] Z. J. Wang, M. Wu, H. Zhao, W. Trappe, and K. J. R. Liu, “Anti-col-lusion forensics of multimedia fingerprinting using orthogonal modu-lation,” IEEE Trans. Image Process., vol. 14, no. 6, pp. 804–821, Jun.2005.

[3] D. Boneh and J. Shaw, “Collusion-secure fingerprinting for digitaldata,” IEEE Trans. Inf. Theory, vol. 44, no. 5, pp. 1897–1905, Sep.1998.

[4] Y. Yacobi, “Improved Boneh-Shaw content fingerprinting,” in Proc.CT-RSA , 2001, vol. 2020, Lecture Notes Comput. Sci., pp. 378–391.

[5] R. Safavi-Naini and Y. Wang, “Collusion secure q-ary fingerprintingfor perceptual content,” in Proc. Security and Privacy in Digital RightsManagement, 2002, pp. 57–75.

[6] W. Trappe, M. Wu, Z. J. Wang, and K. J. R. Liu, “Anti-collusion fin-gerprinting for multimedia,” IEEE Trans. Signal Process., Special IssueSignal Process. Data Hiding Digital Media Secure Content Del., vol.51, no. 4, pp. 1069–1087, Apr. 2003.

[7] S. He and M. Wu, “Joint coding and embedding techniques for multimedia fingerprinting,” IEEE Trans. Inf. Forensics Security, vol. 1, no. 2, pp. 231–247, Jun. 2006.

[8] H. Jin, J. Lotspiech, and S. Nusser, “Traitor tracing for prerecorded and recordable media,” in Proc. ACM Workshop on Digital Rights Management, Washington, DC, Oct. 2004, pp. 83–90.

[9] H. Jin and J. Lotspiech, “Attacks and forensic analysis for multimedia content protection,” in Proc. IEEE Int. Conf. Multimedia Expo, Amsterdam, The Netherlands, Jul. 2005, pp. 1392–1395.

[10] H. Jin and J. Lotspiech, “Hybrid traitor tracing,” in Proc. IEEE Int. Conf. Multimedia Expo, Toronto, ON, Canada, Jul. 2006, pp. 1329–1332.

[11] M. Fernandez and M. Soriano, “Soft-decision tracing in fingerprinted multimedia content,” IEEE Trans. Multimedia, vol. 11, no. 2, pp. 38–46, Apr.–Jun. 2004.

[12] I. Cox, J. Kilian, F. Leighton, and T. Shamoon, “Secure spread spectrum watermarking for multimedia,” IEEE Trans. Image Process., vol. 6, no. 12, pp. 1673–1687, Dec. 1997.

[13] H. V. Zhao, M. Wu, Z. J. Wang, and K. J. R. Liu, “Forensic analysis of nonlinear collusion attacks for multimedia fingerprinting,” IEEE Trans. Image Process., vol. 14, no. 5, pp. 646–661, May 2005.

[14] Y. L. Tong, The Multivariate Normal Distribution. New York: Springer-Verlag, 1990.

[15] K. Su, D. Kundur, and D. Hatzinakos, “Statistical invisibility for collusion-resistant digital video watermarking,” IEEE Trans. Multimedia, vol. 7, no. 1, pp. 43–51, Feb. 2005.

[16] M. D. Swanson, B. Zhu, and A. H. Tewfik, “Multiresolution scene-based video watermarking using perceptual models,” IEEE J. Select. Areas Commun., vol. 16, no. 4, pp. 540–550, May 1998.

[17] J. Fridrich, “Visual hash for oblivious watermarking,” in Proc. SPIE Security and Watermarking of Multimedia Contents II, 2000, vol. 3971, pp. 286–294.

[18] Sample Video Sequences, Tech. Univ. Munchen, Munich, Germany. [Online]. Available: http://www.ldv.ei.tum.de/page70?LANG=EN.

Shan He (S’05) received the B.E. and M.S. degrees in automatic control and industrial engineering (with the highest honors) from Tsinghua University, Beijing, China, in 1999 and 2002, respectively, and the Ph.D. degree in electrical engineering from the University of Maryland, College Park, in 2007.

She was a Research Intern with Microsoft Research, Redmond, WA, in 2006. Her research interests include information security and multimedia signal processing.

Dr. He received the Best Master Thesis Award from Tsinghua University in 2002 and the Graduate School Fellowship from the University of Maryland (2002–2004). She received a Best Student Paper Award at the 2006 IEEE International Workshop on Multimedia Signal Processing and an ECE Distinguished Dissertation Fellowship from the University of Maryland, College Park, in 2007.

Min Wu (S’95–M’01–SM’06) received the B.E. degree (Hons.) in electrical engineering and B.A. degree in economics (Hons.) from Tsinghua University, Beijing, China, in 1996 and the Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2001.

Since 2001, she has been with the faculty of the Department of Electrical and Computer Engineering and the Institute of Advanced Computer Studies, University of Maryland at College Park, where she is currently an Associate Professor. Previously, she was with the NEC Research Institute and Panasonic Laboratories. She coauthored Multimedia Data Hiding (Springer-Verlag, 2003) and Multimedia Fingerprinting Forensics for Traitor Tracing (EURASIP/Hindawi, 2005), and holds five U.S. patents. Her research interests include information security and forensics, multimedia signal processing, and multimedia communications.

Dr. Wu received a National Science Foundation CAREER Award in 2002, a University of Maryland at College Park George Corcoran Education Award in 2003, a Massachusetts Institute of Technology Technology Review’s TR100 Young Innovator Award in 2004, and an Office of Naval Research (ONR) Young Investigator Award in 2005. She was a corecipient of the 2004 EURASIP Best Paper Award and a 2005 IEEE Signal Processing Society Best Paper Award. She served as Finance Chair for the 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), and is an Associate Editor of IEEE SIGNAL PROCESSING LETTERS and an Area Editor for the E-Newsletter of IEEE Signal Processing Magazine.