A Detection-Theoretic and Computational
Framework for Designing Geometrically Resilient
Watermarking Systems
Pierre Moulin
University of Illinois at Urbana-Champaign
www.ifp.uiuc.edu/~moulin/talks/wacha05-slides.pdf
WaCha, Barcelona
June 8, 2005
Outline
• A communication model for geometric attacks
– Role of Information Theory and Detection Theory
– “Complexity” of geometric attacks
• Example: Unitary Geometric Attack Channels
• Invariant vs GLRT vs Pilot-based WM schemes
An Image Watermarking System
[Figure: block diagram of an image watermarking system. The encoder takes the original image S, the secret key k, and the binary representation (1001001101001110100...101) of a message, and outputs the watermarked image X. The message reads: “Picture taken by Alice on January 1, 2000. This message is going to be embedded forever in this picture. I challenge you to remove the message without substantially altering the picture.” A pirate attacks X. The decoder, which uses the same secret key k, outputs the decoded binary message and the decoded message.]
Attacks on Images
[Figure: six image panels. Original; JPEG, QF = 10; 4 × 4 median filtering; Gaussian filter (σ = 3); rotated by 10 degrees; random bending.]
A Communication Model for Geometric Attacks
• Attacker maps watermarked $X = (X_1, \dots, X_n)$ into degraded $Y = (Y_1, \dots, Y_n)$ using a stochastic mapping $p(y|x)$.
• Distortion function $d(x, y)$
• Feasible mappings satisfy a distortion constraint
– on average: $E[d(X, Y)] \le D_2$
– or with probability one: $d(X, Y) \le D_2$
• Would like a “geometrically inspired” $d(x, y)$
Attack Model and Distortion Function [MM’02]
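[Figure: attack model. $x$ → memoryless attack channel $A(z|x)$ → $z$ → geometric attack $T_\theta(\cdot)$ with parameter $\theta$ → $y$.]
• Geometric (desynchronization) parameter $\theta \in \Theta$
• $T_\theta(\cdot)$: smooth, invertible mapping
• Additive distortion function $d_a(x, z) = \frac{1}{n} \sum_{i=1}^{n} d_a(x_i, z_i)$
• Distortion function $d(x, y) = \min_{\theta \in \Theta} d_a(x, T_\theta^{-1}(y))$, invariant to geometric attacks in the class $\{T_\theta, \theta \in \Theta\}$
• Maximum distortion level $D_2$ for attacker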
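The geometry-invariant distortion can be sketched numerically. The example below assumes squared error for the per-sample distortion and integer cyclic shifts for the class of geometric attacks; both are illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def additive_distortion(x, z):
    """d_a(x, z) = (1/n) * sum_i (x_i - z_i)^2 (squared error is an assumption)."""
    return float(np.mean((x - z) ** 2))

def invariant_distortion(x, y, shifts):
    """d(x, y) = min over theta of d_a(x, T_theta^{-1}(y)).

    T_theta is taken to be an integer cyclic shift, T_theta(z) = np.roll(z, theta),
    so its inverse is np.roll(y, -theta).
    """
    return min(additive_distortion(x, np.roll(y, -theta)) for theta in shifts)

rng = np.random.default_rng(0)
n = 64
x = rng.standard_normal(n)
z = x + 0.1 * rng.standard_normal(n)   # memoryless attack A(z|x)
y = np.roll(z, 5)                      # geometric attack T_theta with theta = 5

d_plain = additive_distortion(x, y)               # penalizes the shift heavily
d_inv = invariant_distortion(x, y, range(n))      # only sees the memoryless part
print(d_plain, d_inv)
```

Under these assumptions the plain distortion is dominated by the shift, while the invariant distortion reduces to the distortion introduced by the memoryless stage.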
Information-Theoretic Setup
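[Figure: block diagram. Message $M$, host $s$ and key $k$ enter the encoder $f_n(s, m, k)$, which outputs $x$; the attack $p(y|x)$ produces $y$; the decoder $g_n(y, k)$, which shares the key $k$, outputs the estimate $\hat{M}$.]
• Communications with side information (Gel’fand-Pinsker 1980)
• $M$ uniformly distributed over message set $\mathcal{M}_n$
– Coding problem: $R \triangleq \lim_{n \to \infty} \frac{1}{n} \log_2 |\mathcal{M}_n| > 0$
– Detection problem: $\mathcal{M}_n$ independent of $n$ ⇒ $R = 0$
• Distortion levels $D_1$ and $D_2$
• Class of attacks: $\mathcal{P}_n \triangleq \{p_{Y|X}\}$
• Attacker knows $f_n$, $g_n$, selects $(A_{Z|X}, \theta) \sim p_{Y|X} \in \mathcal{P}_n$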
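• Minmax probability of error:
$$P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) = \inf_{f_n, g_n} \ \sup_{p_{Y|X} \in \mathcal{P}_n} P_e(f_n, g_n, p_{Y|X})$$
• Rate $R$ is achievable if $\limsup_{n \to \infty} P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) = 0$
• Supremum of achievable rates is capacity $C(D_1, D_2)$
• Error exponent
$$e^*(R, D_1, D_2) = \liminf_{n \to \infty} \, -\frac{1}{n} \log P_e^*(n, \mathcal{M}_n, \mathcal{P}_n), \quad 0 \le R \le C$$
• Write $P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) \doteq 2^{-n\, e^*(R, D_1, D_2)}$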
• Can derive expression for $C(D_1, D_2)$ for various classes of attacks involving additive distortion functions:
– Memoryless attacks [MO’99]
– Max-distortion attacks [CL’01, SM’03]
• Can also derive upper and lower bounds on $e^*(R, D_1, D_2)$ [SM’04] [MW’04]
• What happens under geometric attacks?
Complexity of Geometric Attacks
• Consider two cases: receiver knows θ or not
• If receiver knows θ, it can “undo” geometric attacks
• If receiver doesn’t know θ but Θ is compact,
– there is no decrease in capacity; $C(D_1, D_2)$ is achieved using a traditional decoder, aided by a pilot.
– there is not even a decrease in $e_r^*(R, D_1, D_2)$, i.e., there exists a universal decoder against such geometric attacks
Standard WM Codes and Their Limitations
• Example: standard Quantization Index Modulation (QIM) codes perform well against additive Gaussian attacks but are vulnerable to scaling attacks, delays, warping, etc.
• The main culprit is the minimum-Euclidean-distance decoder
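The vulnerability to scaling can be illustrated with a toy scalar QIM (dither modulation) code. This is a minimal sketch under assumed parameters (step size, host variance, 10% gain), not the specific codes discussed in the talk; `qim_embed` and `qim_decode` are hypothetical names.

```python
import numpy as np

def qim_embed(s, bits, delta):
    # Quantize each host sample to the lattice delta*Z + b*delta/2 chosen by its bit b.
    offset = bits * delta / 2.0
    return np.round((s - offset) / delta) * delta + offset

def qim_decode(y, delta):
    # Minimum-Euclidean-distance decoder: pick the nearer of the two lattices.
    d0 = np.abs(y - np.round(y / delta) * delta)
    d1 = np.abs(y - (np.round((y - delta / 2) / delta) * delta + delta / 2))
    return (d1 < d0).astype(int)

rng = np.random.default_rng(1)
n, delta = 10_000, 1.0
s = 10.0 * rng.standard_normal(n)          # host samples (assumed Gaussian)
bits = rng.integers(0, 2, size=n)
x = qim_embed(s, bits, delta)

# Mild additive Gaussian noise: decoding stays essentially error-free.
ber_noise = np.mean(qim_decode(x + 0.05 * rng.standard_normal(n), delta) != bits)
# A 10% amplitude scaling moves large samples across decision boundaries.
ber_scale = np.mean(qim_decode(1.1 * x, delta) != bits)
print(ber_noise, ber_scale)
```

The scaling attack displaces a sample by an amount proportional to its magnitude, so the fixed-lattice minimum-distance decoder fails on most samples that are far from the origin.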
Unitary Geometric Attack Channels
• Assume $s, x, y \in \mathbb{R}^n$ and $d_a(x, y) = \|x - y\|^2$
• $T_\theta$ is a unitary matrix (geometric attack is linear and preserves signal energy)
• Example: cyclic delay attack
– Attacker performs bandlimited interpolation of $x$, applies cyclic delay $\theta \in [0, n]$, and resamples the signal
• Assume $S \sim \mathcal{N}(0, \Sigma)$ and $T_\theta \Sigma T_\theta^T$ is independent of $\theta$
⇒ statistics of $S$ are invariant under $T_\theta$
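A cyclic delay attack of this kind can be sketched with the DFT, where bandlimited interpolation followed by a delay and resampling becomes a linear phase ramp. The sketch assumes odd $n$ (so there is no Nyquist bin and a real input yields a real output); `cyclic_delay` is a hypothetical helper name.

```python
import numpy as np

def cyclic_delay(x, theta):
    """Delay x cyclically by theta samples (theta may be fractional).

    The delay is applied as a linear phase ramp in the DFT domain, which
    corresponds to bandlimited interpolation, delay, and resampling.
    """
    n = len(x)
    if n % 2 == 0:
        raise ValueError("this sketch assumes odd n")
    f = np.fft.fftfreq(n)                      # frequencies in cycles/sample
    return np.fft.ifft(np.fft.fft(x) * np.exp(-2j * np.pi * f * theta)).real

rng = np.random.default_rng(0)
x = rng.standard_normal(257)
y = cyclic_delay(x, 3.7)                       # fractional delay
energy_in, energy_out = np.sum(x**2), np.sum(y**2)
print(energy_in, energy_out)                   # equal up to rounding: T_theta is unitary
```

For an integer delay the operation reduces to a plain circular shift. Because every such $T_\theta$ is a circulant orthogonal matrix, the invariance condition on $T_\theta \Sigma T_\theta^T$ holds, for example, whenever $\Sigma$ is itself circulant.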
Example: M-ary Watermark Detection in iid Gaussian Noise
• Code rate $R = 0$
• Additive spread-spectrum embedding rule $x = s + w_m$
• $M \le n$ orthogonal watermarks $w_m \in \mathbb{R}^n$, each with energy $\|w_m\|^2 = nD_1$
• Watermark constellation $\mathcal{C} = \{w_m\}$; transformed watermark constellation $\mathcal{C}_\theta = \{T_\theta w_m\}$
• Total noise at receiver $\sim \mathcal{N}(0, \sigma^2 I_n)$
• Watermark-to-noise ratio: $\mathrm{WNR} = D_1/\sigma^2$
• Minimum distance of $\mathcal{C}_\theta$ (in units of $\sigma$): $d_{\min} = \sqrt{2n\,\mathrm{WNR}}$, same for all $\theta$
Coherent Case: Detector knows θ
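• Hypothesis test: $H_m : Y \sim \mathcal{N}(T_\theta w_m, \sigma^2 I_n)$, $m \in \mathcal{M}$
• Optimal likelihood ratio test (LRT) is a correlator-detector:
$$\hat{m} = \arg\max_{m \in \mathcal{M}} \ y^T T_\theta w_m$$
• Error probability:
$$P_e \le \frac{M - 1}{2}\, Q(d_{\min}/2) \doteq e^{-n\,\mathrm{WNR}/4}$$
• Computational complexity: no search, just $|\mathcal{M}|$ correlations ⇒ $|\mathcal{M}|$ ops/sample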
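The coherent correlator-detector can be simulated in a few lines. The sketch below assumes random orthogonal watermarks, an integer cyclic shift as the unitary attack, and illustrative values for $n$, $M$, $D_1$ and $\sigma$; the host signal is lumped into the total Gaussian noise, as in the model above.

```python
import numpy as np

rng = np.random.default_rng(2)
n, M, D1, sigma = 256, 8, 1.0, 2.0          # illustrative values

# M orthogonal watermarks, each with energy n*D1 (scaled orthonormal columns).
Q, _ = np.linalg.qr(rng.standard_normal((n, M)))
W = np.sqrt(n * D1) * Q                     # W[:, m] is w_m

def T(v, theta):
    # Unitary geometric attack, assumed here to be an integer cyclic shift.
    return np.roll(v, theta, axis=0)

def coherent_detect(y, W, theta):
    # m_hat = argmax_m  y^T T_theta w_m   (theta is known to the detector)
    return int(np.argmax(y @ T(W, theta)))

theta, trials, errors = 37, 2000, 0
for _ in range(trials):
    m = int(rng.integers(M))
    y = T(W[:, m], theta) + sigma * rng.standard_normal(n)
    errors += int(coherent_detect(y, W, theta) != m)
pe_hat = errors / trials

wnr = D1 / sigma**2
d_min = np.sqrt(2 * n * wnr)                # normalized minimum distance
# Q(d_min/2) decays like exp(-d_min**2 / 8) = exp(-n * wnr / 4).
print(pe_hat, d_min, np.exp(-n * wnr / 4))
```

With these values the exponent $n\,\mathrm{WNR}/4$ equals 16, so errors are far too rare to show up in a few thousand trials; lowering $n$ or raising $\sigma$ makes the empirical error rate visible.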
Noncoherent Case: Detector doesn’t know θ
• Hypothesis test:
$$H_m : Y \sim \mathcal{N}(T_\theta w_m, \sigma^2 I_n), \quad m \in \mathcal{M}, \ \theta \in \Theta$$
• Worst-case error probability $\max_{\theta \in \Theta} P_e(f_n, g_n, \theta)$
• Can we do (nearly) as well as in the coherent case?
• What kind of detector $g_n$ is (nearly) optimal?
• What kind of watermark code $f_n$ should we use?
Taxonomy for Practical WM Schemes
• Invariant WM schemes
• Generalized Likelihood Ratio Test (GLRT) detectors
• Pilot-aided detection
Invariant Watermarks
• Invariant watermark: select embedding domain such that $p(y|\theta, H_m)$ is independent of $\theta$
– $\theta$ is nonidentifiable
• Detector has same performance as in coherent case (against memoryless attacks in invariant domain)
• No increase in computational complexity
• Possible loss of robustness against memoryless attacks in original image domain
• And an invariant domain does not exist in general!
Invariant Detection Tests
• Construct good detection statistics whose distribution is independent of $\theta$
• Example: noncoherent detection of sinusoids ($M$-ary FSK) subject to cyclic delay attacks:
$$w_m(i) = \sqrt{2D_1}\, \sin(2\pi f_m i), \quad 0 \le i < n, \quad f_m = (K + m)/n$$
• Detection statistics $z_m = \left| \sum_{i=0}^{n-1} y(i)\, e^{j 2\pi f_m i} \right|^2$, $m \in \mathcal{M}$
• Detection test: $\hat{m} = \arg\max_{m \in \mathcal{M}} z_m$
• Error probability $P_e \le (M - 1)\, e^{-n\,\mathrm{WNR}/4}$
• No loss in error exponent with respect to the coherent case
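The FSK example lends itself to a short numerical check that the statistics $z_m$ do not depend on the delay. The sketch assumes odd $n$, a DFT-based fractional cyclic delay, and illustrative values for $K$, $M$, $D_1$ and $\sigma$; helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, M, K, D1, sigma = 257, 8, 20, 1.0, 2.0   # illustrative values; n odd
i = np.arange(n)
f = (K + np.arange(M)) / n                  # f_m = (K + m) / n
W = np.sqrt(2 * D1) * np.sin(2 * np.pi * np.outer(i, f))   # W[:, m] = w_m

def cyclic_delay(x, theta):
    # Fractional cyclic delay via a linear phase ramp in the DFT domain (odd n).
    k = np.fft.fftfreq(len(x))
    return np.fft.ifft(np.fft.fft(x) * np.exp(-2j * np.pi * k * theta)).real

def detection_statistics(y):
    # z_m = | sum_i y(i) exp(j 2 pi f_m i) |^2, one value per candidate m
    return np.abs(np.exp(2j * np.pi * np.outer(f, i)) @ y) ** 2

m, theta = 5, 100.3                         # theta is unknown to the detector
y = cyclic_delay(W[:, m], theta) + sigma * rng.standard_normal(n)
m_hat = int(np.argmax(detection_statistics(y)))

# Without noise, the statistics are identical before and after the delay.
z_ref = detection_statistics(W[:, m])
z_delayed = detection_statistics(cyclic_delay(W[:, m], theta))
print(m_hat, np.allclose(z_ref, z_delayed))
```

The delay only rotates the phase of each DFT coefficient, and the squared magnitude discards that phase, which is why no search over $\theta$ is needed.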
Generalized Likelihood Ratio Test (GLRT)
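• Step 1: Maximum-likelihood estimation:
$$\hat{\theta}_m \triangleq \arg\max_{\theta} \ p(y|\theta, H_m) = \arg\min_{\theta} \ \|y - T_\theta w_m\|, \quad m \in \mathcal{M}$$
• Step 2: Correlator detector:
$$\hat{m} = \arg\max_{m \in \mathcal{M}} \ y^T T_{\hat{\theta}_m} w_m$$
• Asymptotic optimality of GLRT: for $\theta \in \mathbb{R}$, still $P_e \doteq e^{-n\,\mathrm{WNR}/4}$!
• Computational complexity: mostly $|\mathcal{M}|$ full searches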
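The two GLRT steps can be sketched as follows, assuming a discrete parameter set of integer cyclic shifts and random orthogonal watermarks (illustrative choices; `glrt_detect` is a hypothetical name). Since $T_\theta$ is unitary, $\|T_\theta w_m\|$ does not depend on $\theta$, so minimizing $\|y - T_\theta w_m\|$ is the same as maximizing the correlation $y^T T_\theta w_m$.

```python
import numpy as np

def glrt_detect(y, W, thetas):
    """GLRT over integer cyclic shifts. W has shape (n, M); returns (m_hat, theta_hat)."""
    thetas = list(thetas)
    best_corr, best_theta = [], []
    for m in range(W.shape[1]):
        # Step 1: ML estimate of theta under H_m (one full search per hypothesis).
        corr = np.array([y @ np.roll(W[:, m], t) for t in thetas])
        k = int(np.argmax(corr))
        best_corr.append(corr[k])
        best_theta.append(thetas[k])
    # Step 2: correlator detector evaluated at each hypothesis's own estimate.
    m_hat = int(np.argmax(best_corr))
    return m_hat, best_theta[m_hat]

rng = np.random.default_rng(4)
n, M, D1, sigma = 128, 4, 1.0, 0.5          # illustrative values
Q, _ = np.linalg.qr(rng.standard_normal((n, M)))
W = np.sqrt(n * D1) * Q                     # orthogonal watermarks, energy n*D1

m, theta = 1, 77
y = np.roll(W[:, m], theta) + sigma * rng.standard_normal(n)
m_hat, theta_hat = glrt_detect(y, W, range(n))
print(m_hat, theta_hat)
```

The nested loop makes the cost visible: $|\mathcal{M}|$ full searches over $\Theta$, each candidate costing $n$ multiply-adds.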
Pilot-Aided Detection
[Figure: pilot signal and information-bearing signals along the time axis.]
• Pilot known to receiver, conveys info about channel law $p_{Y|X}$
• Up to $n - 1$ orthogonal WM’s $w_m$, each with energy $nD_w$
• Assume pilot $p \in \mathbb{R}^n$ is orthogonal to $\{w_m\}$, has energy $nD_p$.
• Transmit watermarked signal $x = s + w_m + p$
• Embedding distortion $= nD_1 = n(D_w + D_p)$ ⇒ $D_w < D_1$
• Computational complexity: mostly one full search (match $p$ to $y$)
• Reduces effective WNR by a factor of $1 - D_p/D_1$ and therefore decreases the error exponent
• Large estimation errors $\hat{\theta} - \theta$ also contribute to $P_e$
⇒ optimal $D_p$ results from large-deviations analysis
Detection vs. computational-complexity tradeoff
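A pilot-aided detector can be sketched under the same toy assumptions used for the unitary case: integer cyclic shifts for $T_\theta$, random orthogonal signals, and the host lumped into the Gaussian noise. The split $D_p = 0.4\,D_1$ is arbitrary, not an optimized value, and `pilot_detect` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(5)
n, M, D1, Dp, sigma = 128, 4, 1.0, 0.4, 0.5   # illustrative values
Dw = D1 - Dp                                  # energy left for the watermark

# M watermarks plus one pilot, mutually orthogonal.
Q, _ = np.linalg.qr(rng.standard_normal((n, M + 1)))
W = np.sqrt(n * Dw) * Q[:, :M]                # each w_m has energy n*Dw
p = np.sqrt(n * Dp) * Q[:, M]                 # pilot has energy n*Dp

def pilot_detect(y, W, p):
    n = len(y)
    # One full search: match the known pilot to y to estimate theta.
    theta_hat = int(np.argmax([y @ np.roll(p, t) for t in range(n)]))
    # Then a coherent correlator at the estimated theta.
    m_hat = int(np.argmax(y @ np.roll(W, theta_hat, axis=0)))
    return m_hat, theta_hat

m, theta = 2, 77
y = np.roll(W[:, m] + p, theta) + sigma * rng.standard_normal(n)
m_hat, theta_hat = pilot_detect(y, W, p)

wnr_factor = Dw / D1                          # equals 1 - Dp/D1
print(m_hat, theta_hat, wnr_factor)
```

Compared with the GLRT, the search is run once instead of $|\mathcal{M}|$ times, at the price of spending part of the distortion budget on the pilot.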
More General Geometric Attacks
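• Generally, $T_\theta$ is not unitary, not even linear
[Figure: received signal $y$ and the transformed watermark sets $\{T_\theta w_1\}$, $\{T_\theta w_2\}$, $\{T_\theta w_3\}$, ... in $\mathbb{R}^n$.]
• How can we generalize the previous WM design/detection approaches?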
• Invariant WM’s: very hard if not impossible to construct
• GLRT approach:
– due to invertibility and smoothness of $T_\theta(\cdot)$, GLRT is asymptotically optimal as $n \to \infty$ provided $\Theta$ is not “too complex” (e.g., $\Theta \subset \mathbb{R}^d$ where $d \ll n$)
– Proof is based on notion of competitive minimaxity [FM’02]
• Pilot-based approach: capacity-achieving, but lower error exponents
Fast Search
• Search for $\hat{\theta}_m = \arg\max_{\theta \in \Theta} p(y|\theta, H_m)$, $m \in \mathcal{M}$
• Computational cost of full search (for discrete $\Theta$) $\sim n\, |\mathcal{M}|\, |\Theta|$
• Replace full search by partial search
• Analogous to classical signal processing problems such as fast motion estimation in video, fast image registration, etc.
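For the special case of integer cyclic shifts, the search cost can also be cut without giving up exhaustiveness. This is a different technique from the partial search mentioned above: the sketch computes the correlation at all $n$ shifts with one FFT-based circular cross-correlation, at a cost of order $n \log n$ per hypothesis instead of $n^2$. It assumes real signals and applies only to cyclic shifts; function names are hypothetical.

```python
import numpy as np

def shift_correlations_bruteforce(y, w):
    # c[t] = y^T roll(w, t) for every integer shift t: n correlations of length n.
    return np.array([y @ np.roll(w, t) for t in range(len(y))])

def shift_correlations_fft(y, w):
    # Same values via circular cross-correlation: IDFT of Y[k] * conj(W[k]).
    return np.fft.ifft(np.fft.fft(y) * np.conj(np.fft.fft(w))).real

rng = np.random.default_rng(6)
n, theta = 128, 40
w = rng.standard_normal(n)                       # a single watermark (illustrative)
y = np.roll(w, theta) + 0.5 * rng.standard_normal(n)

c_slow = shift_correlations_bruteforce(y, w)
c_fast = shift_correlations_fft(y, w)
theta_hat = int(np.argmax(c_fast))
print(theta_hat, np.allclose(c_slow, c_fast))
```

For general, non-cyclic $T_\theta$ no such shortcut is available, which is where partial or coarse-to-fine searches of the kind used in motion estimation and image registration come in.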
More General Watermarking Codes [M’03]
• Gel’fand-Pinsker setup
[Figure: codewords $u$, $u'$ and their transformed versions under $G_{\theta_0}$ and $G_{\theta_1}$.]
• How to make it practical?
Conclusion
• Detection performance vs complexity tradeoffs
• Asymptotics ($n \to \infty$):
– Assume $\theta$ is “low-dimensional” or belongs to compact set
– Invariant watermarks may not exist...
– Pilot-based schemes are capacity-achieving but cause loss in error exponents
– GLRT is asymptotically optimal for our problem
• In practice:
– GLRT-type detectors with fast search may be attractive
– So are pilot-based schemes, if $|\mathcal{M}_n|$ is large
References
[LN’98] Lapidoth and Narayan (IT 1998)
[FL’98] Feder and Lapidoth (IT 1998)
[MO’99] Moulin and O’Sullivan 1999 (IT 2003)
[MM’02] Moulin and Mihcak (IP 2002)
[CL’01] Cohen and Lapidoth 2001 (IT 2002)
[FM’02] Feder and Merhav (IT 2002)
[SM’03] Somekh-Baruch and Merhav (IT 2003)
[SM’04] Somekh-Baruch and Merhav (IT 2004)
[M’03] Moulin (SSP 2003)
[MW’04] Moulin and Wang (ITW 2004)