Andrea Abrardo - unipr.it

Andrea Abrardo

Dept. of Information Engineering, University of Siena – ITALY

• Robust data hiding problem

• Data hiding applications

• Informed coding and informed embedding

• Practical encoding schemes

• Robust data hiding by means of orthogonal and quasi-orthogonal dirty paper coding

b : data to be hidden

c : cover signal

cw : transmitted signal

n : attack

Goal: transmit the maximum amount of data for a given maximum energy of c − cw (maximum cover signal distortion) and a given attack energy, so that the original data can be extracted from r without error.

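[Figure: block diagram of the data hiding channel, showing the transmitter (input b), the cover signal c, the transmitted signal cw, the attack n and the received signal r.]

w : hidden data signal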

Performance: in the Gaussian case (c and n i.i.d. Gaussian rvs) the maximum amount of data per transmitted sample is given by the Shannon bound:

[Figure: blind scheme. The transmitter is a channel encoder and modulator that maps b to w; the diagram also shows c, cw, n and r.]

• Exploitation of the cover signal that is known at the encoder:

• Recalling some consolidated results of information theory, i.e., Max H. M. Costa, “Writing on Dirty Paper”, IEEE Trans. Inf. Theory, May 1983, we have:
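As a rough numerical illustration of the two bounds discussed so far: Costa's result states that, when c is known at the encoder, the capacity equals that of the channel without c, i.e. 0.5·log2(1 + P/N). The blind bound below treats c as additional Gaussian noise, which is an assumption about the formula intended on the earlier slide. The names P, Q and N (per-sample energies of w, c and n) are illustrative and are not the slides' notation.

```python
import math

def blind_capacity(P, Q, N):
    """Bits per sample when the cover (energy Q) acts as extra Gaussian noise.
    Assumption: this is how the blind approach is modelled."""
    return 0.5 * math.log2(1.0 + P / (Q + N))

def costa_capacity(P, N):
    """Bits per sample with the cover known at the encoder (Costa, 1983):
    the cover energy does not appear."""
    return 0.5 * math.log2(1.0 + P / N)

# Example: the cover is 100 times stronger than the hidden signal.
P, Q, N = 1.0, 100.0, 1.0
rates = (blind_capacity(P, Q, N), costa_capacity(P, N))
```

With these example values the informed rate is 0.5 bit per sample, while the blind rate is far smaller because the cover dominates the noise term.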

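[Figure: informed scheme. The transmitter performs informed encoding and embedding: it takes both b and the cover signal c as inputs and produces w; the diagram also shows cw, n and r.]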

[Figure: comparison of the blind approach and the informed approach, each with inputs b and c.]

cw : the energy of c − cw must be kept below a perceptual threshold

r = cw + n

Detector:

Robustness is crucial, transmission rate is in general less important


• In multiuser communications, users could be encoded such that encoding is based on the noncausal knowledge of the interference caused by other users.

• In G. Caire et al., “On the Achievable Throughput of a Multiantenna Gaussian Broadcast Channel”, IEEE Trans. Inf. Theory, July 2003, it is shown that by exploiting the dirty paper concept, a multiantenna Gaussian broadcast channel with single-antenna non-cooperative receivers can achieve the same performance as MIMO systems where receivers are allowed to cooperate.

• In S. Sandeep Pradhan et al., “Duality Between Source Coding and Channel Coding and its Extension to the Side Information Case”, IEEE Trans. Inf. Theory, May 2003, the duality between source coding with side information at the decoder and channel coding with side information at the encoder is discussed.

• A classical scenario of source coding with side information at the decoder is distributed source coding (e.g., sensor networks).

Exploiting the dirty paper concept, a distributed source coding scenario with non-cooperative transmitters can achieve the same performance as cooperative source coding.

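[Figure: distributed source coding of S1 and S2. S1 is coded at rate H(S1) and S2 at rate H(S2) − I(S1,S2), for a total rate H(S1,S2).]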

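[Figure: Costa's binning scheme, shown in three steps over a codebook split into a “0” bin and a “1” bin. Informed coding: for b = 0, the codeword u of the “0” bin whose scaled version u/α is closest to c is selected. Informed embedding: cw = u + (1 − α)c. Decoding: the received r is mapped to the closest u/α, giving b* = 0.]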

• The ideal Costa scheme (ICS) is not practical because of the huge random codebook it requires.

• The problem is that of finding a structured codebook that spans the n-dimensional space, i.e., finding a structured vector quantization scheme (Quantization Index Modulation, QIM) for c.

• More specifically, a scalar version of QIM, Dither Modulation (DM), is usually adopted, as shown in J. Eggers et al., “Scalar Costa Scheme for Information Embedding”, IEEE Trans. Signal Proc., April 2003.

• The DM scheme is usually referred to as the scalar Costa scheme (SCS):

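[Figure: scalar Costa scheme on a one-dimensional lattice with alternating “0” and “1” bins. For b = 0, c is quantized to the nearest point u/α of the “0” bin.]

q = u/α − c

w = α q

cw = c + w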

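• Upon receiving r = cw + n, the receiver evaluates the soft estimate y:

[Figure: the received value r on the lattice of “0” and “1” bins, with the nearest codeword u and the soft estimate y.]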

Hard decision:
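A minimal sketch of scalar DM embedding and detection, assuming a uniform quantizer of step `delta` with the “1” bin shifted by `delta/2` with respect to the “0” bin. The function names, `delta`, and the hard-decision threshold `delta/4` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def scs_embed(c, bits, delta, alpha):
    """Quantize c to the nearest point u/alpha of the bin selected by each bit,
    then move c a fraction alpha of the way: cw = c + alpha * (u/alpha - c)."""
    c = np.asarray(c, dtype=float)
    offset = np.asarray(bits) * delta / 2.0          # "1" bin is shifted by delta/2
    u_over_alpha = delta * np.round((c - offset) / delta) + offset
    q = u_over_alpha - c                             # quantization error
    return c + alpha * q                             # w = alpha * q

def scs_soft(r, delta):
    """Soft estimate y: signed distance of r from the nearest "0"-bin point."""
    r = np.asarray(r, dtype=float)
    return delta * np.round(r / delta) - r

def scs_hard(r, delta):
    """Hard decision: |y| close to 0 means bit 0, |y| close to delta/2 means bit 1."""
    return (np.abs(scs_soft(r, delta)) > delta / 4.0).astype(int)
```

In this sketch the embedding distortion per sample is bounded by `alpha * delta / 2`, and the soft values `scs_soft(r, delta)` are what a soft-input decoder would consume.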

• If an error correcting code is used to produce the encoded data b, soft-input decoding algorithms can be used to decode the most likely transmitted message from y. Turbo Coded DM (TC-DM) shows a loss of 4 to 5 dB with respect to ICS.

• In order to reduce the gap between ICS and DM, vector quantization must be considered.

• A practical way to do it is by means of dirty trellis coded modulation:

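[Figure: dirty TCM encoder. A MUX combines the k0 information bits and the k auxiliary bits into kt bits, which feed a rate-kt/q convolutional code followed by a 2^q-level PAM mapper producing u.]

k0 = information bits, k = auxiliary bits, kt = k0 + k

Rt = kt, R = k0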

• Informed coding is performed by decoding c over the dirty trellis:

[Figure: c is fed to the TCM dirty Viterbi decoder, which outputs the k auxiliary bits of the MLSE (minimum distance) sequence. Only 2^k transitions among the possible 2^kt are considered, since the k0 information bits are known.]

• Then auxiliary bits and information bits are used to compute u using the TCM encoder

• Then cw = u + (1 − α) c is evaluated and transmitted (informed embedding)

• Decoding can be accomplished by standard Viterbi decoding

[Figure: r is fed to the TCM dirty Viterbi decoder, which outputs the k0 + k bits of the MLSE sequence. All the possible 2^kt transitions are considered.]
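The two uses of the Viterbi algorithm described above can be sketched on a toy trellis. Everything specific here is a hypothetical choice made only to keep the example small: 4 states, kt = 2 input bits per step (one information bit and one auxiliary bit), next state equal to the current input, and a 16-level PAM label built from state and input. The `target` argument is whatever sequence is being quantized (c in the slides).

```python
def step(state, x):
    """Toy trellis branch: returns (next state, PAM amplitude). Hypothetical code."""
    level = 4 * state + x                # branch label in 0..15 (q = 4 coded bits)
    return x, 2.0 * level - 15.0

def viterbi(target, allowed):
    """Minimum-distance path; allowed[t] lists the inputs usable at step t."""
    survivors = {0: (0.0, [])}           # state -> (cost, input path), start in state 0
    for y, inputs in zip(target, allowed):
        nxt = {}
        for s, (cost, path) in survivors.items():
            for x in inputs:
                ns, amp = step(s, x)
                cand = cost + (y - amp) ** 2
                if ns not in nxt or cand < nxt[ns][0]:
                    nxt[ns] = (cand, path + [x])
        survivors = nxt
    return min(survivors.values(), key=lambda cp: cp[0])[1]

def encode(path):
    """Run the trellis encoder over a sequence of inputs x = 2*info + aux."""
    s, u = 0, []
    for x in path:
        s, amp = step(s, x)
        u.append(amp)
    return u

def informed_coding(target, info_bits):
    """Information bits are fixed; only the auxiliary bit is free (2^k branches)."""
    allowed = [(2 * b, 2 * b + 1) for b in info_bits]
    return encode(viterbi(target, allowed))

def decode(r):
    """Standard Viterbi over all 2^kt branches; returns the information bits."""
    path = viterbi(r, [range(4)] * len(r))
    return [x >> 1 for x in path]
```

The encoder-side search visits only the branches compatible with the known information bits, whereas the decoder searches the full trellis and then discards the auxiliary bits.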

• In order to be effective, k and the number of states must be high, which implies high complexity both at the encoder and at the decoder

• Because they resemble long random codes, serially concatenated dirty trellis structures are expected to approach ICS (i.e., to approach capacity), in the same way as turbo codes approach Shannon capacity.

• However, since informed coding requires finding the codeword u with minimum distance to c, iterative turbo decoding principles cannot be used for this purpose.

• The only way to perform MLSE detection is by means of exhaustive search.

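[Figure: serially concatenated dirty trellis encoder. A MUX combines k0 and k bits into kt bits, which pass through a rate-kt/m outer convolutional code, an interleaver, a rate-m/n inner convolutional code and a 2^n-level PAM mapper producing u.]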

• Some feasible concatenated TCM schemes have been proposed in which a “weak” code without interleaving is used to design the codebook (source code) and a “strong” code with interleaving is used for channel coding of the information bits.

• Moreover, due to the weak structure of the source code, these schemes are not able to approach ICS (the gap is 2.5 to 3 dB)

• Up to now, DM with turbo coding is still considered the most attractive scheme due to its simplicity and good performance.

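[Figure: feasible concatenated TCM scheme. The k0 information bits pass through the outer convolutional code (labelled m/k0, the channel code) and an interleaver; a MUX combines the resulting m bits with the k auxiliary bits into kt bits, which feed the rate-kt/n inner convolutional code (the source code) and a 2^n-level PAM mapper producing u.]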

• In many practical cases, the attack (channel) can be more realistically modelled by means of gain attack + additive noise

[Figure: informed encoding and embedding followed by a gain attack. The transmitter takes b and c and produces w; the received signal is r = g cw + n.]

• DM and TCM approaches are sensitive to gain attacks since information is conveyed by amplitude.

• If the host feature space is scaled without informing the decoder, the hidden information is irremediably lost

[Figure: host feature space scaled by a gain g > 1.]
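The sensitivity of DM to gain attacks can be checked numerically. The sketch below assumes a scalar DM with step `delta` and a “1” bin shifted by `delta/2` (illustrative choices, with hypothetical function names), scales the marked signal by g = 1.3, and compares a decoder that ignores the gain with one that is told g.

```python
import numpy as np

def dm_embed(c, bits, delta, alpha):
    """Scalar DM: move c a fraction alpha towards the nearest point of the chosen bin."""
    offset = bits * delta / 2.0
    u = delta * np.round((c - offset) / delta) + offset
    return c + alpha * (u - c)

def dm_decode(r, delta):
    """Hard decision on the distance from the nearest "0"-bin point."""
    y = delta * np.round(r / delta) - r
    return (np.abs(y) > delta / 4.0).astype(int)

rng = np.random.default_rng(0)
c = rng.normal(0.0, 50.0, 2000)          # cover much larger than the step
bits = rng.integers(0, 2, 2000)
cw = dm_embed(c, bits, delta=1.0, alpha=0.7)

g = 1.3                                  # gain attack, no additive noise
err_unknown_gain = np.mean(dm_decode(g * cw, 1.0) != bits)
err_known_gain = np.mean(dm_decode(g * cw, g * 1.0) != bits)
```

In this setting the decoder that ignores g is expected to behave almost like a coin flip, while rescaling the quantizer step by g restores error-free decoding.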

• Possible solutions to the gain attack include

– Estimation of g

– Use of equi-energetic codewords, i.e., spherical codes

[Figure: the marked vector cw, the codeword u/α and the scaled vector g cw.]

• Since in ICS n tends to infinity, the codewords u/α actually lie on a sphere; hence the ICS is still capacity achieving.

• Hence, in order to overcome gain attack, spherical codes for a finite n must be designed.

• For finite n, the energy of c is variable. When |c|² is small, the radius of the codebook u/α must be kept small, thus reducing the protection against additive noise (fading-like effect).

[Figure: fading-like scenario, comparing a low-energy and a high-energy cover vector.]

• Of course, this effect is reduced as n increases.

• High-dimensional spherical codes are in general very difficult to design (this is related to the problem of designing high-dimensional spherical lattices).

• The problem can be managed when small code rates are considered, e.g., ui = ±1, so that Rt = kt/q.

• In this case classical binary linear codes can be considered (e.g., block, convolutional).

• However, while informed coding can be performed simply by MLSE decoding (e.g., over the dirty trellis), informed embedding becomes very complex.

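[Figure: binary dirty trellis encoder. A MUX combines k0 and k bits into kt bits, which feed a rate-kt/q convolutional code followed by a 2-level PAM mapper producing u.]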

• Indeed, since the decision regions have a very irregular shape in this case, Costa’s scheme for informed embedding is far from optimum:

• When n is high, Costa’s scheme yields a loss of several dB with respect to the optimum embedding strategy

[Figure: starting from c, the embedding vector w obtained with Costa’s scheme compared with the w of the optimum scheme.]

• In M. L. Miller, G. J. Doerr, I. J. Cox, “Dirty-paper trellis codes for watermarking”, ICIP 2002, September 2002, an iterative approach for informed embedding is proposed.

• In the case of dirty trellis codes, the iterative approach converges very slowly to a sub-optimum solution. Moreover, its implementation complexity is very high (several thousand iterations).

[Figure: successive iterates cw(n−1) and cw(n) moving towards the final cw.]

• In A. Abrardo, M. Barni, “Orthogonal dirty paper coding for informed data hiding”, IS&T/SPIE’s 16-th Annual Symposium, 18-22 January 2004, we proposed to use orthogonal codes to ease the embedding process.

• Dirty coding can be seen as a two-step process:

– Given bj, choose a codeword in Qj (say um): informed coding

– Move c within the decoding (Voronoi) region associated to um: informed embedding

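• $U \in \mathbb{R}^{n \times n}$: unitary matrix, $n = 2^{k_t}$

• $u_i \in \mathbb{R}^{n}$: columns of U, i.e., the $2^{k_t}$ codewords

• U is partitioned into $2^{k_0}$ disjoint subsets $Q_j$ of $2^{k}$ codewords each ($k = k_t - k_0$); the data word $b_j$ selects $Q_j$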

• The carrier codeword um is chosen so as to maximize the correlation with the cover feature sequence c (minimum distance):

$u_m = \arg\max_{u_i \in Q_j} \; c \cdot u_i$

[Figure: the cover vector c, the subset Qj and the selected codeword um.]
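A small sketch of informed coding with orthogonal codewords, under two illustrative assumptions: the unitary matrix is a normalized Sylvester-Hadamard matrix, and subset Qj consists of the 2^k consecutive columns starting at j·2^k. The slides do not specify either choice, and the function names are hypothetical.

```python
import numpy as np

def hadamard(n):
    """Normalized Sylvester-Hadamard matrix; n must be a power of two.
    Its columns are orthonormal, so it can play the role of U."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def informed_coding(U, c, j, k):
    """Index m of the column in Q_j with maximum correlation with c.
    Assumption: Q_j = columns j*2^k ... (j+1)*2^k - 1."""
    start = j * 2 ** k
    corr = U[:, start:start + 2 ** k].T @ c
    return start + int(np.argmax(corr))

def decode(U, r, k):
    """Correlation decoder: subset index of the best-matching column."""
    return int(np.argmax(U.T @ r)) // 2 ** k
```

Because the decoder only compares correlations, multiplying r by any positive gain leaves the decision unchanged, which is the property that motivates equi-energetic codewords.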

• The marked feature vector cw is determined by minimizing the embedding distortion for a given robustness R

• The robustness measure is defined as the maximum pairwise error probability between um and the codewords ui that do not belong to Qj, when an AWGN channel is assumed (Pe*).

• By standard digital communication arguments, the above constraint can be rewritten as:

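$$\min_{q \,|\, u_q \notin Q_j} (u_m - u_q) \cdot c_w \ge S$$

where the threshold S is determined by the target error probability $P_e^*$ and by the noise level, expressed through the DNR.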

• Due to the orthogonality of codewords, writing $c_w = c + w$ with $w = U a$, we have:

Constraint: $(u_m - u_q) \cdot c_w = (u_m - u_q) \cdot c + (u_m - u_q) \cdot w = (u_m - u_q) \cdot c + a_m - a_q \ge S, \quad \forall\, u_q \notin Q_j$

• Then we need to calculate the embedding distortion. By exploiting again the orthogonality of codewords, we have:

Distortion: $|c_w - c|^2 = |w|^2 = |U a|^2 = |a|^2$

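• Hence the optimum embedding problem can be reformulated as follows: find a such that:

$$a = \arg\min_{a} \Big( a_m^2 + \sum_{q \,|\, u_q \notin Q_j} a_q^2 \Big) \quad \text{subject to} \quad a_m - a_q \ge S - (u_m - u_q) \cdot c, \;\; \forall\, q \,|\, u_q \notin Q_j$$

• For a given $a_m$, the constraint reads $a_q \le a_m + (u_m - u_q) \cdot c - S$, and the minimum-energy choice is $a_q = \min\{0,\; a_m + (u_m - u_q) \cdot c - S\}$, so that:

$$a_m = \arg\min_{a_m} \Big( a_m^2 + \sum_{q \,|\, u_q \notin Q_j} \min\{0,\; a_m + (u_m - u_q) \cdot c - S\}^2 \Big)$$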

Optimum embedding reduces to a simple monodimensional problem
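The one-dimensional problem can be solved in closed form by enumerating active sets. The sketch below assumes the cost a_m² + Σ_q min{0, a_m + (u_m − u_q)·c − S}² over the codewords u_q outside Qj, with w = U a and a unitary U. The function name, its arguments and the value of S passed in are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def optimum_embedding(U, c, m, outside, S):
    """Minimum-distortion cw such that (u_m - u_q).cw >= S for all q in `outside`.
    U: unitary matrix, c: cover vector, m: carrier index,
    outside: indices of the codewords not in Q_j, S: required margin."""
    corr = U.T @ c
    t = S - (corr[m] - corr[outside])        # required value of a_m - a_q
    ts = np.sort(t)[::-1]                    # largest requirements first
    # Stationary point of the convex cost if the K largest t are "active".
    K = np.arange(len(ts) + 1)
    cand = np.concatenate(([0.0], np.cumsum(ts))) / (1.0 + K)
    cost = lambda am: am ** 2 + np.sum(np.minimum(0.0, am - t) ** 2)
    am = min(cand, key=cost)                 # the true minimizer is among cand
    a = np.zeros(U.shape[0])
    a[m] = am
    a[outside] = np.minimum(0.0, am - t)     # a_q = min{0, a_m - t_q}
    return c + U @ a, a
```

The cost is convex and piecewise quadratic in a_m, so its minimizer satisfies the stationarity condition for some set of active constraints; evaluating the cost at each candidate and keeping the smallest avoids any iterative search. The resulting distortion is |a|².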

• The requirement that the codewords are orthogonal strongly limits the number of possible codewords

• This results in high distortion (i.e., low DWR) and a low bit rate

• In order to improve the performance we can relax the orthogonality constraint by adopting (quasi-orthogonal) Gold sequences:

• A higher DWR is obtained for a given robustness level

• The distance between the imposed robustness and the actual error rate increases slightly

• By relying on the simple structure of orthogonal and Gold sequences, low complexity schemes for robust data hiding can be derived.

• Such schemes achieve robustness against value-metric scaling and show a performance that is approximately 2 dB away from that of ST-DM schemes.

• The proposed scheme is expected to work very well for watermarking of real data, namely still images and video, because embedding is performed on short blocks of data (i.e., the DCT blocks of JPEG/H.263).

• The use of more powerful spherical codes, such as laminated spherical codes, could reduce the gap with ST-DM, at the expense of implementation complexity.
