
A Detection-Theoretic and Computational Framework for Designing Geometrically Resilient Watermarking Systems

Pierre Moulin

University of Illinois at Urbana-Champaign

www.ifp.uiuc.edu/~moulin/talks/wacha05-slides.pdf

WaCha, Barcelona

June 8, 2005

1

Outline

• A communication model for geometric attacks

– Role of Information Theory and Detection Theory

– “Complexity” of geometric attacks

• Example: Unitary Geometric Attack Channels

• Invariant vs GLRT vs Pilot-based WM schemes

2

An Image Watermarking System

[Figure: block diagram of the watermarking system. The encoder takes the original image S, the secret key k, and the message (in binary representation, 1001001101001110100...101) and produces the watermarked image X. A pirate attacks X. The decoder, using the secret key k, outputs the decoded binary message and the decoded message.]

Message: “Picture taken by Alice on January 1, 2000. This message is going to be embedded forever in this picture. I challenge you to remove the message without substantially altering the picture.”

3

Attacks on Images

[Figure: six image panels: Original; JPEG, QF=10; 4×4 median filtering; Gaussian filter (σ = 3); Rotated by 10 degrees; Random bending]

4

A Communication Model for Geometric Attacks

• Attacker maps watermarked X = (X1, · · · , Xn) into degraded Y = (Y1, · · · , Yn) using stochastic mapping p(y|x).

• Distortion function d(x,y)

• Feasible mappings satisfy a distortion constraint

on average: E[d(X,Y)] ≤ D2

or with probability one: d(X,Y) ≤ D2

• Would like “geometrically-inspired” d(x,y)

5

Attack Model and Distortion Function [MM’02]

[Figure: x → memoryless channel A(z|x) → z → geometric attack Tθ(·), with parameter θ → y]

• Geometric (desynchronization) parameter θ ∈ Θ

• Tθ(·) smooth, invertible mapping

• Additive distortion function $d_a(x, z) = \frac{1}{n} \sum_{i=1}^{n} d_a(x_i, z_i)$

• Distortion function $d(x, y) = \min_{\theta \in \Theta} d_a(x, T_\theta^{-1}(y))$, invariant to geometric attacks in the class $\{T_\theta, \theta \in \Theta\}$

• Maximum distortion level $D_2$ for the attacker
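The invariant distortion defined above can be illustrated numerically. This is a minimal sketch, not the construction of [MM’02]: it assumes squared error as the per-sample distortion and takes Tθ to be an integer cyclic shift, so that the inverse mapping is a shift in the opposite direction. The function names are hypothetical.

```python
import numpy as np

def additive_distortion(x, z):
    """d_a(x, z) = (1/n) * sum_i d_a(x_i, z_i), with squared error assumed per sample."""
    x, z = np.asarray(x, float), np.asarray(z, float)
    return float(np.mean((x - z) ** 2))

def invariant_distortion(x, y, shifts):
    """d(x, y) = min over theta of d_a(x, T_theta^{-1}(y)).

    Assumption: T_theta is an integer cyclic shift, so T_theta^{-1}(y) = np.roll(y, -theta).
    """
    return min(additive_distortion(x, np.roll(y, -theta)) for theta in shifts)

rng = np.random.default_rng(0)
x = rng.normal(size=64)
y = np.roll(x, 5)                      # pure geometric attack, no memoryless noise

plain = additive_distortion(x, y)      # large: samples are misaligned
invariant = invariant_distortion(x, y, range(64))  # zero: the shift is "free"
print(plain, invariant)
```

Under this distortion measure a pure shift costs the attacker nothing, which is the sense in which d(x,y) is invariant to the class {Tθ}.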

6

Information-Theoretic Setup

[Figure: message M, host s, and key k enter the encoder $f_n(s, m, k)$, which outputs x; the attack p(y|x) outputs y; the decoder $g_n(y, k)$ outputs the estimate $\hat{M}$]

• Communications with side information (Gel’fand-Pinsker 1980)

• M uniformly distributed over message set Mn

– Coding problem: $R \triangleq \lim_{n\to\infty} \frac{1}{n} \log_2 |\mathcal{M}_n| > 0$

– Detection problem: Mn independent of n ⇒ R = 0

• Distortion levels D1 and D2

• Class of attacks: $\mathcal{P}_n \triangleq \{p_{Y|X}\}$

• Attacker knows $f_n$, $g_n$, selects $(A_{Z|X}, \theta) \sim p_{Y|X} \in \mathcal{P}_n$

7


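• Minmax probability of error:

$P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) = \inf_{f_n, g_n} \sup_{p_{Y|X} \in \mathcal{P}_n} P_e(f_n, g_n, p_{Y|X})$

• Rate R is achievable if $\limsup_{n\to\infty} P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) = 0$

• Supremum of achievable rates is capacity $C(D_1, D_2)$

• Error exponent

$e^*(R, D_1, D_2) = \liminf_{n\to\infty} -\frac{1}{n} \log P_e^*(n, \mathcal{M}_n, \mathcal{P}_n)$, $0 \le R \le C$

• Write $P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) \doteq 2^{-n\, e^*(R, D_1, D_2)}$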

8

• Can derive expression for C(D1, D2) for various classes of attacks involving additive distortion functions:

– Memoryless attacks [MO’99]

– Max-distortion attacks [CL’01, SM’03]

• Can also derive upper and lower bounds on e∗(R,D1, D2) [SM’04] [MW’04]

• What happens under geometric attacks?

9

Complexity of Geometric Attacks

[Figure: x → memoryless channel A(z|x) → z → geometric attack Tθ(·), with parameter θ → y]

• Consider two cases: receiver knows θ or not

• If receiver knows θ, it can “undo” geometric attacks

• If receiver doesn’t know θ but Θ is compact,

– there is no decrease in capacity; C(D1, D2) is achieved using traditional decoder, aided by pilot.

– there is not even a decrease in $e_r^*(R, D_1, D_2)$, i.e., there exists a universal decoder against such geometric attacks

10

Standard WM Codes and Their Limitations

• Example: standard Quantization Index Modulation codes perform well against additive Gaussian attacks but are vulnerable to scaling attacks, delays, warping, etc.

• The main culprit is the minimum-Euclidean-distance decoder
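The sensitivity of a minimum-Euclidean-distance decoder to scaling can be seen in a small experiment. The sketch below uses scalar binary QIM with a hypothetical step size `delta`; the host variance, noise level, and scaling factor are illustrative assumptions, not values from the talk.

```python
import numpy as np

def qim_embed(s, bits, delta):
    """Quantize each host sample to the lattice delta*Z + bit*delta/2."""
    offset = bits * delta / 2.0
    return delta * np.round((s - offset) / delta) + offset

def qim_decode(y, delta):
    """Minimum-Euclidean-distance decoder: pick the closer of the two lattices."""
    dists = []
    for b in (0, 1):
        offset = b * delta / 2.0
        nearest = delta * np.round((y - offset) / delta) + offset
        dists.append(np.abs(y - nearest))
    return (dists[1] < dists[0]).astype(int)

rng = np.random.default_rng(0)
n, delta = 10_000, 4.0                      # illustrative values
s = rng.normal(scale=50.0, size=n)          # host signal
bits = rng.integers(0, 2, size=n)
x = qim_embed(s, bits, delta)

y_noise = x + rng.normal(scale=0.3, size=n) # mild additive Gaussian attack
y_scaled = 0.9 * x                          # amplitude scaling, no noise at all

ber_noise = float(np.mean(qim_decode(y_noise, delta) != bits))
ber_scaled = float(np.mean(qim_decode(y_scaled, delta) != bits))
print(ber_noise, ber_scaled)
```

In this setup the additive noise causes very few bit errors, while the scaling misaligns the received samples with the decoder's lattices and pushes the bit error rate toward chance.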

11

Unitary Geometric Attack Channels

• Assume $s, x, y \in \mathbb{R}^n$ and $d_a(x, y) = \|x - y\|^2$

• Tθ is a unitary matrix (geometric attack is linear and preserves signal energy)

• Example: cyclic delay attack

– Attacker performs bandlimited interpolation of x, applies cyclic delay θ ∈ [0, n], and resamples signal

• Assume $S \sim \mathcal{N}(0, \Sigma)$ and $T_\theta \Sigma T_\theta^T$ is independent of θ

⇒ statistics of S are invariant under Tθ
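A cyclic delay with bandlimited interpolation can be implemented as a phase ramp in the DFT domain, which makes its unitarity easy to check. This is a sketch under the assumption of a real signal of odd length (so no Nyquist bin needs special handling); `cyclic_delay` is a hypothetical helper name.

```python
import numpy as np

def cyclic_delay(x, theta):
    """Bandlimited cyclic delay of a real signal by theta samples (theta may be fractional).

    Assumes odd length so that every nonzero frequency has a conjugate partner.
    """
    n = len(x)
    assert n % 2 == 1, "sketch assumes odd length"
    k = np.arange(n // 2 + 1)                       # non-negative frequency bins
    phase = np.exp(-2j * np.pi * k * theta / n)     # delay = linear phase
    return np.fft.irfft(np.fft.rfft(x) * phase, n=n)

rng = np.random.default_rng(1)
x = rng.normal(size=101)
y = cyclic_delay(x, 3.7)

print(np.sum(x ** 2), np.sum(y ** 2))   # energies agree: the attack is unitary
```

For integer θ the operation reduces to a plain circular shift, and delaying by −θ undoes a delay by θ.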

12

Example: M-ary Watermark Detection in iid Gaussian Noise

• Code rate R = 0

• Additive spread-spectrum embedding rule $x = s + w_m$

• M ≤ n orthogonal watermarks $w_m \in \mathbb{R}^n$, each with energy $\|w_m\|^2 = nD_1$

• Watermark constellation $C = \{w_m\}$; transformed watermark constellation $C_\theta = \{T_\theta w_m\}$

• Total noise at receiver $\sim \mathcal{N}(0, \sigma^2 I_n)$

• Watermark-to-noise ratio: $\mathrm{WNR} = D_1/\sigma^2$

• Minimum distance of $C_\theta$: $d_{\min} = \sqrt{2n\,\mathrm{WNR}}$, same for all θ
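The minimum-distance expression follows from orthogonality of the watermarks and unitarity of Tθ. The short derivation below assumes, as the formula suggests but the slide does not state explicitly, that the distance is measured in units of the noise standard deviation σ:

```latex
\begin{aligned}
\|T_\theta w_m - T_\theta w_{m'}\|^2
  &= \|w_m - w_{m'}\|^2
  && (T_\theta \text{ unitary}) \\
  &= \|w_m\|^2 + \|w_{m'}\|^2 - 2\, w_m^T w_{m'} = 2nD_1
  && (\text{orthogonal, equal energy}) \\
d_{\min} &= \frac{\sqrt{2nD_1}}{\sigma} = \sqrt{2n\,\mathrm{WNR}}
  && (\mathrm{WNR} = D_1/\sigma^2)
\end{aligned}
```

Because the first step holds for every θ, the transformed constellation has the same minimum distance as the original one.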

13

Coherent Case: Detector knows θ

• Hypothesis test: $H_m : Y \sim \mathcal{N}(T_\theta w_m, \sigma^2 I_n)$, $m \in \mathcal{M}$

• Optimal likelihood ratio test (LRT) is a correlator-detector:

$\hat{m} = \arg\max_{m \in \mathcal{M}} y^T T_\theta w_m$

• Error probability:

$P_e \le \frac{M-1}{2}\, Q(d_{\min}/2) \doteq e^{-n\,\mathrm{WNR}/4}$

• Computational complexity: no search, just |M| correlations ⇒ |M| ops/sample
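The coherent correlator-detector can be sketched in a few lines. The example assumes that Tθ is an integer cyclic shift (one simple unitary choice) and builds orthogonal watermarks from a QR factorization of a random matrix; both are illustrative choices, and the helper names are hypothetical.

```python
import numpy as np

def make_watermarks(n, M, D1, rng):
    """M orthogonal watermarks in R^n (as columns), each with energy n*D1."""
    q, _ = np.linalg.qr(rng.normal(size=(n, M)))   # orthonormal columns
    return np.sqrt(n * D1) * q

def coherent_detect(y, W, theta):
    """Correlator detector with known theta: argmax_m y^T T_theta w_m.

    Assumption: T_theta is an integer cyclic shift (np.roll), which is unitary.
    """
    shifted = np.roll(W, theta, axis=0)            # apply T_theta to every column
    return int(np.argmax(shifted.T @ y))           # |M| correlations, no search

rng = np.random.default_rng(2)
n, M, D1, sigma = 256, 8, 1.0, 1.0                 # WNR = D1 / sigma^2 = 1
W = make_watermarks(n, M, D1, rng)
m_true, theta = 5, 17
y = np.roll(W[:, m_true], theta) + rng.normal(scale=sigma, size=n)

m_hat = coherent_detect(y, W, theta)
print(m_hat)
```

With these parameters the correct correlation is about nD1 = 256 while the competing ones have standard deviation 16, so errors are extremely unlikely.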

14

Noncoherent Case: Detector doesn’t know θ

• Hypothesis test:

$H_m : Y \sim \mathcal{N}(T_\theta w_m, \sigma^2 I_n)$, $m \in \mathcal{M}$, $\theta \in \Theta$

• Worst-case error probability maxθ∈Θ Pe(fn, gn, θ)

• Can we do (nearly) as well as in the coherent case?

• What kind of detector gn is (nearly) optimal?

• What kind of watermark code fn should we use?

15

Taxonomy for Practical WM Schemes

• Invariant WM schemes

• Generalized Likelihood Ratio Test (GLRT) detectors

• Pilot-aided detection

16

Invariant Watermarks

• Invariant watermark: select embedding domain such that p(y|θ, Hm) is independent of θ

– θ is nonidentifiable

• Detector has same performance as in coherent case (against memoryless attacks in invariant domain)

• No increase in computational complexity

• Possible loss of robustness against memoryless attacks in original image domain

• And an invariant domain does not exist in general!

17

Invariant Detection Tests

• Construct good detection statistics whose distribution is independent of θ

• Example: noncoherent detection of sinusoids (M-ary FSK) subject to cyclic delay attacks:

$w_m(i) = \sqrt{2D_1}\, \sin(2\pi f_m i)$, $0 \le i < n$, $f_m = (K + m)/n$

• Detection statistics $z_m = \left| \sum_{i=0}^{n-1} y(i)\, e^{j 2\pi f_m i} \right|^2$, $m \in \mathcal{M}$

• Detection test: $\hat{m} = \arg\max_{m \in \mathcal{M}} z_m$

• Error probability $P_e \le (M-1)\, e^{-n\,\mathrm{WNR}/4}$

• No loss in error exponent wrt coherent case
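The invariance of the FSK statistics to cyclic delay can be checked numerically. The sketch below uses illustrative values for n, M, K, and the delay; because each f_m lies on the DFT grid, the delayed watermark is simply the same sinusoid evaluated at i − θ.

```python
import numpy as np

def fsk_watermark(m, n, K, D1, theta=0.0):
    """w_m(i - theta) = sqrt(2*D1) * sin(2*pi*f_m*(i - theta)), with f_m = (K + m)/n."""
    i = np.arange(n)
    f_m = (K + m) / n
    return np.sqrt(2 * D1) * np.sin(2 * np.pi * f_m * (i - theta))

def fsk_statistics(y, M, K):
    """z_m = |sum_i y(i) * exp(j*2*pi*f_m*i)|^2 for m = 0, ..., M-1."""
    n = len(y)
    freqs = (K + np.arange(M)) / n
    basis = np.exp(2j * np.pi * np.outer(freqs, np.arange(n)))
    return np.abs(basis @ y) ** 2

n, M, K, D1, sigma = 128, 8, 10, 1.0, 1.0          # illustrative values
m_true, theta = 3, 41.3

# Noise-free statistics are identical with and without the delay.
z_no_delay = fsk_statistics(fsk_watermark(m_true, n, K, D1), M, K)
z_delayed = fsk_statistics(fsk_watermark(m_true, n, K, D1, theta), M, K)

# Detection from a delayed, noisy observation, without estimating theta.
rng = np.random.default_rng(3)
y = fsk_watermark(m_true, n, K, D1, theta) + rng.normal(scale=sigma, size=n)
m_hat = int(np.argmax(fsk_statistics(y, M, K)))
print(m_hat)
```

The delay only changes the phase of the DFT coefficient at f_m, and the squared magnitude discards that phase.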

18

Generalized Likelihood Ratio Test (GLRT)

• Step 1: Maximum-Likelihood Estimation:

$\hat{\theta}_m \triangleq \arg\max_\theta p(y|\theta, H_m) = \arg\min_\theta \|y - T_\theta w_m\|$, $m \in \mathcal{M}$

• Step 2: Correlator Detector:

$\hat{m} = \arg\max_{m \in \mathcal{M}} y^T T_{\hat{\theta}_m} w_m$

• Asymptotic optimality of GLRT: θ ∈ R, still $P_e \doteq e^{-n\,\mathrm{WNR}/4}$!

• Computational complexity: mostly |M| full searches
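The two GLRT steps can be sketched for a discrete parameter set. The example assumes integer cyclic shifts for Tθ and random orthogonal watermarks; these are illustrative choices, and `glrt_detect` is a hypothetical name.

```python
import numpy as np

def glrt_detect(y, W, thetas):
    """Two-step GLRT over a discrete set of cyclic shifts (illustrative T_theta).

    Step 1: theta_hat_m = argmin_theta ||y - T_theta w_m||, one full search per m.
    Step 2: m_hat = argmax_m y^T T_{theta_hat_m} w_m.
    """
    best_corr, theta_hats = [], []
    for m in range(W.shape[1]):
        dists = [np.linalg.norm(y - np.roll(W[:, m], t)) for t in thetas]
        t_hat = thetas[int(np.argmin(dists))]
        theta_hats.append(t_hat)
        best_corr.append(float(y @ np.roll(W[:, m], t_hat)))
    m_hat = int(np.argmax(best_corr))
    return m_hat, theta_hats[m_hat]

rng = np.random.default_rng(4)
n, M, D1, sigma = 512, 8, 1.0, 1.0
q, _ = np.linalg.qr(rng.normal(size=(n, M)))
W = np.sqrt(n * D1) * q                     # orthogonal watermarks, energy n*D1
m_true, theta = 2, 100
y = np.roll(W[:, m_true], theta) + rng.normal(scale=sigma, size=n)

m_hat, theta_hat = glrt_detect(y, W, list(range(n)))
print(m_hat, theta_hat)
```

The cost is dominated by the |M| exhaustive searches in step 1, which is the complexity noted on the slide.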

19

Pilot-Aided Detection

[Figure: pilot and information-bearing signals shown along the time axis]

• Pilot known to receiver, conveys info about channel law pY|X

• Up to n− 1 orthogonal WM’s wm, each with energy nDw

• Assume pilot p ∈ Rn is orthogonal to {wm}, has energy nDp.

• Transmit watermarked signal x = s + wm + p

• Embedding distortion = nD1 = n(Dw + Dp) ⇒ Dw < D1

20

• Computational complexity: mostly one full search (match p to y)

• Reduces effective WNR by a factor of 1−Dp/D1

and therefore decreases error exponent

• Large estimation errors θ̂ − θ also contribute to Pe

⇒ optimal Dp results from large-deviations analysis

Detection vs computational-complexity tradeoff
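Pilot-aided detection replaces the |M| searches by a single one. In the sketch below the pilot is matched to y to estimate θ, and the estimate is then used in an ordinary correlator. Integer cyclic shifts, random orthogonal signals, and the even split Dw = Dp are illustrative assumptions, not recommendations from the talk.

```python
import numpy as np

def pilot_aided_detect(y, W, p, thetas):
    """One full search to match the pilot p to y, then |M| correlations.

    Assumption: T_theta is an integer cyclic shift (np.roll).
    """
    pilot_corr = [float(y @ np.roll(p, t)) for t in thetas]
    theta_hat = thetas[int(np.argmax(pilot_corr))]
    m_hat = int(np.argmax(np.roll(W, theta_hat, axis=0).T @ y))
    return m_hat, theta_hat

rng = np.random.default_rng(5)
n, M, Dw, Dp, sigma = 1024, 8, 0.5, 0.5, 1.0       # D1 = Dw + Dp = 1
q, _ = np.linalg.qr(rng.normal(size=(n, M + 1)))   # M watermarks plus one pilot
W = np.sqrt(n * Dw) * q[:, :M]                     # watermarks, energy n*Dw each
p = np.sqrt(n * Dp) * q[:, M]                      # pilot, orthogonal to all w_m

m_true, theta = 6, 300
y = np.roll(W[:, m_true] + p, theta) + rng.normal(scale=sigma, size=n)

m_hat, theta_hat = pilot_aided_detect(y, W, p, list(range(n)))
print(m_hat, theta_hat)
```

Only the energy nDw contributes to the final correlation, which is the WNR reduction by the factor 1 − Dp/D1 mentioned above.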

21

More General Geometric Attacks

• Generally, Tθ is not unitary, not even linear

[Figure: the received signal y and the transformed-watermark sets {Tθ w1}, {Tθ w2}, {Tθ w3}, ...]

• How can we generalize the previous WM design/detection approaches?

22

• Invariant WM’s: very hard if not impossible to construct

• GLRT approach:

– due to invertibility and smoothness of Tθ(·), GLRT is asymptotically optimal as n → ∞ provided Θ is not “too complex” (e.g., Θ ⊂ R^d where d ≪ n)

– Proof is based on notion of competitive minimaxity [FM’02]

• Pilot-based approach: capacity-achieving, but lower error exponents

23

Fast Search

• Search for $\hat{\theta}_m = \arg\max_{\theta \in \Theta} p(y|\theta, H_m)$, $m \in \mathcal{M}$

• Computational cost of full search (for discrete Θ) ∼ n |M| |Θ|

• Replace full search by partial search

• Analogous to classical signal processing problems such as fast motion estimation in video, fast image registration, etc.
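One simple form of partial search is coarse-to-fine: scan a subsampled grid of θ values, then refine around the best coarse candidate. The sketch below is only an illustration of the idea, not a method from the talk. It assumes integer cyclic shifts and a hypothetical smooth (low-frequency) watermark, because the coarse stage is reliable only when the correlation peak is wider than the coarse step.

```python
import numpy as np

def full_search(y, w, n_shifts):
    """Exhaustive search over all cyclic shifts; returns (theta_hat, correlations used)."""
    corr = [float(y @ np.roll(w, t)) for t in range(n_shifts)]
    return int(np.argmax(corr)), n_shifts

def coarse_to_fine_search(y, w, n_shifts, step):
    """Partial search: scan every `step`-th shift, then refine around the best one."""
    coarse = list(range(0, n_shifts, step))
    c_corr = [float(y @ np.roll(w, t)) for t in coarse]
    t0 = coarse[int(np.argmax(c_corr))]
    fine = [(t0 + d) % n_shifts for d in range(-step + 1, step)]
    f_corr = [float(y @ np.roll(w, t)) for t in fine]
    return fine[int(np.argmax(f_corr))], len(coarse) + len(fine)

n = 256
i = np.arange(n)
# Hypothetical smooth watermark: a few low-frequency sinusoids, so the peak is broad.
w = sum(np.sin(2 * np.pi * k * i / n + k) for k in (1, 2, 3))
rng = np.random.default_rng(6)
y = np.roll(w, 77) + rng.normal(scale=0.1, size=n)

theta_full, cost_full = full_search(y, w, n)
theta_fast, cost_fast = coarse_to_fine_search(y, w, n, step=8)
print(theta_full, cost_full, theta_fast, cost_fast)
```

Here the partial search evaluates 47 correlations instead of 256. For a noise-like watermark with a sharp correlation peak, the coarse stage would miss the peak, and a different strategy would be needed.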

24

More General Watermarking Codes [M’03]

• Gelfand-Pinsker setup

[Figure: codewords u and u′ and their images Gθ0 u and Gθ1 u′ under the transformations Gθ0 and Gθ1]

• How to make it practical?

25

Conclusion

• Detection performance vs complexity tradeoffs

• Asymptotics (n →∞):

– Assume θ is “low-dimensional” or belongs to compact set

– Invariant watermarks may not exist...

– Pilot-based schemes are capacity-achieving but cause loss in error exponents

– GLRT is asymptotically optimal for our problem

• In practice:

– GLRT-type detectors with fast search may be attractive

– So are pilot-based schemes, if |Mn| is large

26

References

-[LN’98] Lapidoth and Narayan (IT 1998)

-[FL’98] Feder and Lapidoth (IT 1998)

-[MO’99] Moulin and O’Sullivan 1999 (IT 2003)

-[MM’02] Moulin and Mihcak (IP 2002)

-[CL’01] Cohen and Lapidoth 2001 (IT 2002)

-[FM’02] Feder and Merhav (IT 2002)

-[SM’03] Somekh-Baruch and Merhav (IT 2003)

-[SM’04] Somekh-Baruch and Merhav (IT 2004)

-[M’03] Moulin (SSP 2003)

-[MW’04] Moulin and Wang (ITW 2004)

27
