


  • Higher-Order Statistics and Pairs Analysis

    CSM25 Secure Information Hiding

    Dr Hans Georg Schaathun

    University of Surrey

    Spring 2009 – Week 5

    Dr Hans Georg Schaathun Higher-Order Statistics and Pairs Analysis Spring 2009 – Week 5 1 / 39

    Outcomes

    Understand the principle of higher-order statistics.
    Be able to implement at least one steganalysis technique using higher-order statistics.


    Reading

    Core Reading

    «Quantitative steganalysis of digital images: estimating the secret message length» by Jessica Fridrich, Miroslav Goljan, Dorin Hogea, David Soukal, in Multimedia Systems 2003

    Suggested Reading

    «Higher-order statistical steganalysis of palette images» by Jessica Fridrich, Miroslav Goljan, David Soukal

    Suggested Reading

    «Detecting LSB Steganography in Color and Gray-Scale Images» by Jessica Fridrich, Miroslav Goljan, and Rui Du (State University of New York, Binghamton), in Multimedia and Security 2001


    Background

    Where $\chi^2$ falls short

    The $\chi^2$ test we have seen
    analyses the histogram only;
    detects embedding in consecutive pixels.

    What if the message is randomly spread across the image?
    Generalised $\chi^2$ analysis.

    Yes/No answer; cannot estimate the message length.
    Can be fooled if the message is biased (more 0s than 1s or vice versa).


  • Background

    Higher-order statistics: pixels in neighbourhoods

    Pairs of Values counts single pixels → first-order statistic.

    Higher-order statistics:
    count pairs of (neighbour) pixels (2nd order);
    pixel triplets (3rd order).

    Study relations between pixels in a neighbourhood.
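
    The difference between first- and second-order statistics can be illustrated with a minimal Python sketch. The helper names `first_order` and `second_order` and the two toy images are illustrative assumptions, not part of the lecture, and only horizontal neighbours are counted.

```python
from collections import Counter


def first_order(image):
    """Histogram: how often each pixel value occurs (first-order statistic)."""
    return Counter(v for row in image for v in row)


def second_order(image):
    """Counts of horizontally adjacent pixel pairs (second-order statistic)."""
    pairs = Counter()
    for row in image:
        for left, right in zip(row, row[1:]):
            pairs[(left, right)] += 1
    return pairs


# Two toy images with the same histogram but different neighbourhood structure.
smooth = [[10, 10, 11, 11],
          [10, 10, 11, 11]]
noisy = [[10, 11, 10, 11],
         [11, 10, 11, 10]]

print(first_order(smooth) == first_order(noisy))    # True: identical histograms
print(second_order(smooth) == second_order(noisy))  # False: pair counts differ
```

    Both images contain four pixels of each value, so a histogram-based test cannot tell them apart, while the pair counts differ.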


    Pairs analysis

    Pairs Analysis

    Pairs Analysis is quantitative, i.e. it estimates the message length.

    Originally designed for GIF.
    We present it for spatial, grayscale images.


    Pairs analysis The characteristic sequence

    The characteristic sequence

    Let c, c′ be two colours (grayscales).
    Read the image row by row (left to right and top down).
    Assign 0 to c and 1 to c′.
    Ignore all other colours.
    The resulting sequence is denoted $Z(c, c')$.

    Definition

    $$Z = Z(0,1) \,|\, Z(2,3) \,|\, Z(4,5) \,|\, \ldots \,|\, Z(254,255), \qquad (1)$$
    $$Z' = Z(1,2) \,|\, Z(3,4) \,|\, Z(5,6) \,|\, \ldots \,|\, Z(255,0). \qquad (2)$$
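
    The definition can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the image is an 8-bit grayscale image given as a list of rows of integers, pixels are read row by row, and the function names `colour_cut` and `characteristic_sequences` are hypothetical.

```python
def colour_cut(pixels, c0, c1):
    """Z(c0, c1): 0 for each pixel equal to c0, 1 for each equal to c1; others skipped."""
    return [0 if v == c0 else 1 for v in pixels if v in (c0, c1)]


def characteristic_sequences(image):
    """Return (Z, Z') for an 8-bit grayscale image given as a list of rows."""
    pixels = [v for row in image for v in row]  # row by row, left to right
    z, z_shifted = [], []
    for c in range(0, 256, 2):
        z += colour_cut(pixels, c, c + 1)                      # (0,1), (2,3), ...
        z_shifted += colour_cut(pixels, c + 1, (c + 2) % 256)  # (1,2), ..., (255,0)
    return z, z_shifted


image = [[0, 1, 2],
         [3, 1, 0]]
z, z_shifted = characteristic_sequences(image)
print(z)          # [0, 1, 1, 0, 0, 1]
print(z_shifted)  # [0, 1, 0, 0, 1, 1]
```

    Every pixel falls into exactly one colour pair of each pairing, so both sequences have as many bits as the image has pixels.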


    The colour cut

    $Z(c, c')$ extracted from an image.
    Extracted column-wise (Matlab-style).
    Row-wise extraction is equally valid.

    [Figure: colour cut of an image and the resulting bit sequence]

    001111000000111111001101110010111001111110


  • Pairs analysis Homogenous pairs

    Second-order structure

    Second-order structure (of Z and of Z′):
    count pairs of consecutive bits;
    four possible pairs: 00, 01, 10, 11.

    Homogenous pairs: 00, 11.
    Let F be the frequency of homogenous pairs in Z.
    Let R = F/n be the relative frequency,

    where n = N · M − 1 is the number of pairs in an image of N × M pixels.


    Example

    Sequence: 0 0 0 1 1 0 1 1 1 1 0 1 1
    Homog.:    1 1 0 1 0 0 1 1 1 0 0 1

    F = 7, n = 12 (13 bits give 12 pairs), R = 7/12 ≈ 0.5833
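
    Counting homogenous pairs is easy to check by machine. The following is a minimal Python sketch; the function name `homogeneous_pairs` is a hypothetical helper, and the input is assumed to contain at least two bits. Applied to the 13-bit sequence of the example, it finds 7 homogenous pairs among 12 pairs.

```python
def homogeneous_pairs(bits):
    """Return (F, n, R): homogenous-pair count, number of pairs, relative frequency."""
    # One flag per pair of consecutive bits: 1 for 00 or 11, 0 for 01 or 10.
    flags = [int(a == b) for a, b in zip(bits, bits[1:])]
    f, n = sum(flags), len(flags)
    return f, n, f / n


z = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
f, n, r = homogeneous_pairs(z)
print(f, n, round(r, 4))  # 7 12 0.5833
```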


    Expected structure of Z

    Let R(p) = E(R) be the expected relative frequency of homogenous pairs in Z
    when a fraction p of the pixel LSBs has been flipped
    (e.g. if a random unbiased bit string of length 2p has been embedded).

    Theorem
    R(p) is a parabola with minimum at R(1/2) = 1/2,

    $$R(p) = ap^2 + bp + c$$

    for some constants a, b, and c.


    Why parabola?

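    Write Z as r runs of equal bits, with run lengths $k_1, k_2, \ldots, k_r$:

    $$Z = \overbrace{0000}^{k_1}\,\overbrace{111}^{k_2}\,\overbrace{00\ldots0}^{k_3}\,\overbrace{11\ldots1}^{k_4}\cdots\overbrace{11\ldots1}^{k_r}$$

    $$nR(0) = \sum_{i=1}^{r} (k_i - 1)$$

    Homogenous pair remains homogenous: $\Pr = q^2 + (1-q)^2$
    (both change + neither changes).

    Heterogenous pair becomes homogenous: $\Pr = 2q(1-q)$.

    $$nR(q) = \sum_{i=1}^{r} \left[q^2 + (1-q)^2\right](k_i - 1) + 2q(1-q)(r-1)$$

    This is a quadratic in q. At q = 1/2 both probabilities equal 1/2, so nR(1/2) = n/2, i.e. R(1/2) = 1/2.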
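
    The run-length formula can be evaluated directly. The sketch below is an illustration under the assumption that every bit is flipped independently with probability q; the function name `expected_R` is hypothetical.

```python
from itertools import groupby


def expected_R(bits, q):
    """Expected relative frequency of homogenous pairs after flipping each bit
    independently with probability q (run-length formula)."""
    runs = [len(list(g)) for _, g in groupby(bits)]  # run lengths k_1, ..., k_r
    n = len(bits) - 1                                # number of pairs
    homogeneous = sum(k - 1 for k in runs)           # n * R(0)
    heterogeneous = len(runs) - 1                    # one per run boundary
    stay = q * q + (1 - q) * (1 - q)                 # both flip or neither flips
    become = 2 * q * (1 - q)                         # exactly one of the two flips
    return (stay * homogeneous + become * heterogeneous) / n


z = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
for q in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(q, expected_R(z, q))
```

    The printed values are symmetric around q = 1/2 and equal 1/2 at q = 1/2, as the theorem states. The parabola has a minimum there only when R(0) > 1/2, which is the typical case for natural images.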


  • Pairs analysis Homogenous pairs

    The R-function


    Structure of the shifted pairs

    Compare the pairs Z with the shifted pairs Z′.

    Two parabolas

    Assumption

    R′(p) is parabolic and symmetric around p = 1/2, i.e.

    $$R'(p) = a'p^2 + b'p + c'.$$

    We will study D(p) = R(p) − R′(p).
    The difference between two parabolæ is a parabola:

    $$D(p) = ap^2 + bp + c$$

    for unknown constants a, b, and c.


    The R′-function


    Structure of the shifted pairs (II)

    Write $Z' = b_1, b_2, b_3, \ldots, b_n$.

    Theorem

    $$nR'(1/2) = \sum_{k=1}^{n-1} 2^{-k} h_k,$$

    where $h_k$ is the number of homogenous pairs among

    $$(b_1, b_{k+1}), (b_2, b_{k+2}), \ldots, (b_{n-k}, b_n).$$
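
    The sum in the theorem can be computed directly from the shifted sequence. The following Python sketch is an illustration only; the function name `n_times_R_half` and the optional cut-off `max_k` are assumptions introduced here (the terms decay as 2^{-k}, so truncating the sum changes the result very little and avoids quadratic running time on long sequences).

```python
def n_times_R_half(bits, max_k=None):
    """Sum over k of 2**-k * h_k, where h_k counts homogenous pairs (b_i, b_{i+k})."""
    n = len(bits)
    last_k = n - 1 if max_k is None else min(max_k, n - 1)
    total = 0.0
    for k in range(1, last_k + 1):
        # Pair every bit with the bit k positions later and count equal pairs.
        h_k = sum(1 for a, b in zip(bits, bits[k:]) if a == b)
        total += h_k / 2 ** k
    return total


print(n_times_R_half([0, 1, 1, 0]))  # 0.625, from h_1 = 1, h_2 = 0, h_3 = 1
```

    Dividing the result by n gives the estimate of R′(1/2) in the notation of the theorem.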


  • Pairs analysis Homogenous pairs

    The estimate of R′(1/2)


    Zero message assumption

    $$D(p) = ap^2 + bp + c$$

    Assumption

    Z and Z′ have the same structure when no message is embedded.

    R(0) = R′(0), so D(0) = 0 and hence c = 0.


    Zero point


    Symmetry

    Swapping all bits does not change the statistic.
    Swapping a random fraction 1 − q of the bits means
    swapping all bits, and then (re)swapping a random fraction q.

    Embedding q or 1 − q bits is the same thing.
    We conclude that D(q) = D(1 − q).


  • Pairs analysis Homogenous pairs

    Symmetry


    Solving a second-order equation

    We can estimate R and R′ at q = 1/2:

    $$R(1/2) - R'(1/2) = D(1/2) = a/4 + b/2$$

    (left side known).

    We exploit symmetry:

    $$0 = D(q) - D(1-q) = (aq^2 + bq) - \left(a(1-q)^2 + b(1-q)\right).$$

    We solve for a and b (symmetry gives a = −b, and then b = 4D(1/2)) to get

    $$4D(1/2)\,q - 4D(1/2)\,q^2 = D(q),$$

    where D(1/2) and D(q) are known.
    Solving the quadratic equation for q gives the estimate of p.
    The estimated message length is 2p.
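
    The final step can be sketched in Python. Rearranging the quadratic gives q(1 − q) = D(q) / (4 D(1/2)), with roots q = (1 ± sqrt(1 − D(q)/D(1/2))) / 2. The function name `estimate_flip_rate` is hypothetical; taking the smaller root and clamping the ratio to [0, 1] are assumptions of this sketch (by symmetry q and 1 − q cannot be distinguished, and noise may push the ratio slightly out of range).

```python
import math


def estimate_flip_rate(d_q, d_half):
    """Smaller root q of 4*d_half*q - 4*d_half*q**2 = d_q.

    d_q    : observed D(q) = R - R' measured on the image under test
    d_half : estimated D(1/2) = R(1/2) - R'(1/2), assumed non-zero
    """
    ratio = d_q / d_half
    ratio = min(max(ratio, 0.0), 1.0)  # assumption: clamp noisy values
    return (1 - math.sqrt(1 - ratio)) / 2


q = estimate_flip_rate(0.03, 0.04)  # ratio 0.75
print(q, 2 * q)                      # flip rate and estimated message length
```

    With these illustrative numbers the ratio is 0.75, so the estimated flip rate is close to 0.25 and the estimated message length is about half the capacity.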


    Where Pairs Analysis fails Dithered backgrounds

    Dithered backgrounds

    Dithering is used to simulate additional colours.
    Two colours c1 and c2 alternate over an area.
    The appearance would be a uniform colour somewhere in between.
    The colour cut (c1, c2) has many heterogenous pairs.
    The result is that R(0) ≠ R′(0) and our assumption fails.
    This can be fixed by a clever choice of colour cut.
    (I did not find any good examples.)


    RS steganalysis The idea

    RS steganalysis

    Proposed for true colour images.
    Uses information from all 8 bits of a pixel.
    Linked to so-called lossless capacity.


  • RS steganalysis The idea

    Pixel groups and smoothness

    Divide the image into pixel groups G1, G2, . . .
    Disjoint groups.
    Consecutive pixels.

    Define the smoothness of a group $G = (x_1, x_2, \ldots, x_n)$:

    $$f(G) = \sum_{i=1}^{n-1} |x_i - x_{i+1}|.$$

    High f(G) means sharp changes from pixel to pixel.
    This is unusual for neighbour pixels in natural images.

    Compare f(F(G)) and f(G),
    where F flips pixels.


    Bit flipping

    Maps on a single pixel:
    $F_+ : 2i \leftrightarrow 2i + 1$;
    $F_- : 2i \leftrightarrow 2i - 1$;
    $F_0$ is the identity.

    Maps on a group, say of four, $G = (x_1, x_2, x_3, x_4)$:
    $F = [F_0 F_+ F_+ F_0] = [0110]$,
    $F(G) = (F_0(x_1), F_+(x_2), F_+(x_3), F_0(x_4))$.

    Shifted bit flip:
    $-F = [F_0 F_- F_- F_0] = [0(-1)(-1)0]$.

    We will use one map F and the shift −F.
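
    The flipping maps are simple to express in code. The following Python sketch uses hypothetical names (`flip_plus`, `flip_minus`, `apply_mask`) and encodes a map on a group as a list of 0, 1 and −1. Note that the shifted flip can leave the 8-bit range at the ends (0 maps to −1 and 255 to 256); this sketch does not clip such values.

```python
def flip_plus(x):
    """F+: swaps 2i and 2i + 1, i.e. flips the least significant bit."""
    return x ^ 1


def flip_minus(x):
    """F-: swaps 2i and 2i - 1 (the LSB flip shifted by one)."""
    return ((x + 1) ^ 1) - 1


FLIPS = {0: lambda x: x, 1: flip_plus, -1: flip_minus}


def apply_mask(group, mask):
    """Apply the per-pixel maps selected by mask (entries 0, 1 or -1) to a group."""
    return [FLIPS[m](x) for x, m in zip(group, mask)]


group = [100, 101, 103, 102]
mask = [0, 1, 1, 0]
print(apply_mask(group, mask))                # F(G)  -> [100, 100, 102, 102]
print(apply_mask(group, [-m for m in mask]))  # -F(G) -> [100, 102, 104, 102]
```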


    Characteristic Groups

    The groups
    Regular group: f(G) < f(F(G)).
    Singular group: f(G) > f(F(G)).
    Useless group: f(G) = f(F(G)).

    The statistics
    $R_F$: number of regular groups under F.
    $S_F$: number of singular groups under F.
    $R_{-F}$: number of regular groups under −F.
    $S_{-F}$: number of singular groups under −F.

    $R_F$, $R_{-F}$, $S_F$, $S_{-F}$ are studied as functions of p,
    where p is the fraction of pixels flipped by the embedding.
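
    Counting regular and singular groups can be sketched as follows. This is a minimal Python illustration, not the lecture's implementation: the names `smoothness`, `flip` and `rs_counts` are hypothetical, the pixels are taken as one flat list cut into disjoint consecutive groups, and the helpers are defined inside the block so that it runs on its own.

```python
def smoothness(group):
    """f(G): sum of absolute differences between consecutive pixels."""
    return sum(abs(a - b) for a, b in zip(group, group[1:]))


def flip(x, m):
    """Apply F+ (m = 1), F- (m = -1) or the identity (m = 0) to one pixel."""
    if m == 1:
        return x ^ 1
    if m == -1:
        return ((x + 1) ^ 1) - 1
    return x


def rs_counts(pixels, mask):
    """Return (regular, singular) group counts for disjoint consecutive groups."""
    size = len(mask)
    regular = singular = 0
    for start in range(0, len(pixels) - size + 1, size):
        group = pixels[start:start + size]
        flipped = [flip(x, m) for x, m in zip(group, mask)]
        before, after = smoothness(group), smoothness(flipped)
        if after > before:
            regular += 1
        elif after < before:
            singular += 1
        # equal smoothness: useless group, not counted
    return regular, singular


pixels = [100, 101, 103, 102, 50, 50, 50, 50, 10, 11, 11, 10]
mask = [0, 1, 1, 0]
print(rs_counts(pixels, mask))                # (R_F, S_F)   -> (1, 2)
print(rs_counts(pixels, [-m for m in mask]))  # (R_-F, S_-F) -> (3, 0)
```

    Dividing the counts by the number of groups gives relative frequencies, which is the form needed when the statistics are compared across images.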


    Statistics plot


  • RS steganalysis The result

    Approximations

    Approximations based on experimental investigation:
    $S_F$ and $R_F$ are parabolic;
    $S_{-F}$ and $R_{-F}$ are linear.

    $$R_F(p) = a_1p^2 + b_1p + c_1, \qquad (3)$$
    $$S_F(p) = a_2p^2 + b_2p + c_2, \qquad (4)$$
    $$R_{-F}(p) = a_3p + b_3, \qquad (5)$$
    $$S_{-F}(p) = a_4p + b_4. \qquad (6)$$

    11 unknowns (10 coefficients and p)


    Equations

    Observations
    $R_F(p)$, $S_F(p)$, $R_{-F}(p)$, $S_{-F}(p)$ from the steganogram.
    $R_F(1-p)$, $S_F(1-p)$, $R_{-F}(1-p)$, $S_{-F}(1-p)$ by flipping all LSBs.

    Assumptions
    $R_F(1/2) = S_F(1/2)$,
    $R_F(0) = R_{-F}(0)$,
    $S_F(0) = S_{-F}(0)$.

    11 equations.
    With 11 unknowns, this can be solved.


    The message length

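    $$p = \frac{x}{x - 1/2},$$

    where x is the smaller root of

    $$2(d_3 + d_1)x^2 + (d_2 - d_3 - d_4 - 3d_1)x + d_1 - d_2 = 0,$$

    where

    $$d_1 = R_F(p/2) - S_F(p/2),$$
    $$d_2 = R_{-F}(p/2) - S_{-F}(p/2),$$
    $$d_3 = R_F(1 - p/2) - S_F(1 - p/2),$$
    $$d_4 = R_{-F}(1 - p/2) - S_{-F}(1 - p/2).$$

    Here p is the message length as a fraction of the number of pixels, so a fraction p/2 of the LSBs is flipped.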
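
    Given the four differences, the estimate is a matter of solving one quadratic. The Python sketch below is an illustration with a hypothetical function name, `rs_message_length`. It reads "smaller root" as the root with the smaller absolute value, which is an interpretation made here, and it assumes the discriminant is non-negative.

```python
import math


def rs_message_length(d1, d2, d3, d4):
    """Estimate p = x / (x - 1/2) from the quadratic in x.

    d1, d2 are measured on the steganogram (F and -F),
    d3, d4 after flipping all LSBs (F and -F).
    """
    a = 2 * (d3 + d1)
    b = d2 - d3 - d4 - 3 * d1
    c = d1 - d2
    if a == 0:
        x = -c / b  # degenerate case: the equation is linear
    else:
        disc = math.sqrt(b * b - 4 * a * c)  # raises ValueError if negative
        roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
        x = min(roots, key=abs)  # assumption: smaller in absolute value
    return x / (x - 0.5)


# If R and S differ by the same amount under F and -F (d1 == d2),
# the constant term vanishes, x = 0 is a root and the estimate is zero.
print(abs(rs_message_length(0.2, 0.2, 0.1, 0.3)))
```

    The input values in the usage line are made up for illustration and do not come from a real image.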


    Initial bias

    Some experiments show estimates within ±1% of the true length.
    Some images have an initial bias,

    i.e. the cover image appears to have a short message.
    This must be taken into account.
    Short messages cannot be detected with certainty.

    Gaussian distribution: µ = 0, σ = 0.5%.
    Is it possible to estimate the initial bias?

    [Figure: plot from Fridrich et al.]


  • RS steganalysis The result

    Example from Fridrich et al.


    RS steganalysis Counter-measures

    Good stego-systems?
    How do we foil higher-order statistics?

    The steganogram should resemble the cover image,
    not necessarily visually... but statistically.

    Statistics-aware steganography
    Designed for specific higher-order statistics.
    The steganogram resembles the cover with respect to the statistic.
    Still an ad hoc approach.

