Lie Gonzalez Ch8 ppt


Chapter 8: Image Compression

    Wen-Nung Lie, CCU, Taiwan


Applications of image compression

    Televideo conferencing, remote sensing (satellite imagery),
    document and medical imaging, facsimile transmission (FAX),
    control of remotely piloted vehicles in military, space, and
    hazardous waste management applications


Fundamentals

    What is data and what is information? Data are the means by which
    information is conveyed. Various amounts of data may be used to
    represent the same amount of information -> data redundancy:

    Coding redundancy

    Interpixel redundancy

    Psychovisual redundancy


Coding redundancy

    The graylevel histogram of an image can provide a great deal of
    insight into the construction of codes.

    The average number of bits required to represent each pixel is

    $L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$

    where $l(r_k)$ is the number of bits used to represent graylevel
    $r_k$ and $p_r(r_k)$ is its probability.

    Variable-length coding (VLC): higher probability, shorter bit length

    $L_{avg} = 3.0$ bits (code 1), $2.7$ bits (code 2)
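A minimal sketch of the $L_{avg}$ computation; the probabilities and the second, variable-length code below are illustrative stand-ins for the slide's code 1 / code 2 tables:

```python
# Sketch: L_avg = sum_k l(r_k) * p_r(r_k) for two codes over the same
# 8-level source (probabilities here are assumed examples).
probs = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]  # p_r(r_k)
len_code1 = [3] * 8                                       # fixed 3-bit code
len_code2 = [2, 2, 2, 3, 4, 5, 6, 6]                      # variable-length code

L_avg1 = sum(l * p for l, p in zip(len_code1, probs))     # 3.0 bits
L_avg2 = sum(l * p for l, p in zip(len_code2, probs))     # 2.7 bits
print(L_avg1, L_avg2)
```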


Run-length coding to remove spatial redundancy

    $C_R = 2.63, \quad R_D = 1 - \frac{1}{C_R} = 1 - \frac{1}{2.63} = 0.62$


Psychovisual redundancy

    Certain information simply has less relative importance than other
    information in normal visual processing -> psychovisual redundancy.

    The elimination of psychovisually redundant data results in a loss
    of quantitative information; it is called quantization and leads to
    lossy data compression.

    E.g., quantization in graylevels

    E.g., line interlacing in TV (reduced video scanning rate)

    (Figure: (b) false contouring on quantization; (c) improved
    graylevel quantization via dithering)


Fidelity criteria

    Objective fidelity criterion

    Root-mean-square error:

    $e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x,y) - f(x,y)]^2 \right]^{1/2}$

    Mean-square signal-to-noise ratio:

    $SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x,y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x,y) - f(x,y)]^2}$

    Subjective fidelity criterion

    {Much worse, worse, slightly worse, the same, slightly better,
    better, much better}


Image compression model

    Source encoder: removes input redundancies

    Channel encoder: increases the noise immunity of the source
    encoder's output


Source encoder model

    Mapper: transforms the input data into a format designed to reduce
    the interpixel redundancies in the image (reversible)

    Quantizer: reduces the accuracy of the mapper's output (reduces the
    psychovisual redundancies); irreversible, and must be omitted for
    error-free compression

    Symbol encoder: fixed- or variable-length coder (assigns the
    shortest code words to the most frequently occurring outputs to
    reduce coding redundancies); reversible


Channel encoder and decoder

    Designed to reduce the impact of channel noise by inserting a
    controlled form of redundancy into the source encoded data.

    Hamming code as a channel code: the 7-bit Hamming (7,4) code has
    minimum distance 3 and can correct one bit error.

    $h_1 = b_3 \oplus b_2 \oplus b_0 \qquad h_3 = b_3$
    $h_2 = b_3 \oplus b_1 \oplus b_0 \qquad h_5 = b_2$
    $h_4 = b_2 \oplus b_1 \oplus b_0 \qquad h_6 = b_1, \quad h_7 = b_0$

    ($h_1$, $h_2$, $h_4$ are even-parity bits)

    A single-bit error is indicated by a nonzero parity word $c_4 c_2 c_1$:

    $c_1 = h_1 \oplus h_3 \oplus h_5 \oplus h_7$
    $c_2 = h_2 \oplus h_3 \oplus h_6 \oplus h_7$
    $c_4 = h_4 \oplus h_5 \oplus h_6 \oplus h_7$


Information theory

    Explores the minimum amount of data that is sufficient to describe
    an image completely without loss of information.

    Self-information of an event $E$ with probability $P(E)$:

    $I(E) = \log \frac{1}{P(E)} = -\log P(E)$

    $P(E) = 1 \Rightarrow I(E) = 0$
    $P(E) = 1/2 \Rightarrow I(E) = 1$ bit

    The base of the logarithm determines the information units.


Information channel

    Source alphabet $A = \{a_1, a_2, \ldots, a_J\}$

    Symbol probabilities $\mathbf{z} = [P(a_1), P(a_2), \ldots, P(a_J)]^T$,
    with $\sum_{j=1}^{J} P(a_j) = 1$

    The ensemble $(A, \mathbf{z})$ describes the information source
    completely.

    Average information per source output:

    $H(\mathbf{z}) = -\sum_{j=1}^{J} P(a_j) \log P(a_j)$

    $H(\mathbf{z})$ is the uncertainty or entropy of the source, or the
    average amount of information (in m-ary units) obtained by observing
    a single source output.

    For equally probable source symbols, entropy is maximized.
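A small sketch of the entropy computation (the symbol probabilities are assumed, for illustration only):

```python
import math

def entropy(z, base=2):
    """H(z) = -sum_j P(a_j) log P(a_j); zero-probability symbols contribute 0."""
    return -sum(p * math.log(p, base) for p in z if p > 0)

# Equally probable symbols maximize the entropy: H = log2(J) bits
print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits/symbol
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol
```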


    Channel alphabet $B = \{b_1, b_2, \ldots, b_K\}$

    Channel description $(B, \mathbf{v})$, with
    $\mathbf{v} = [P(b_1), P(b_2), \ldots, P(b_K)]^T$

    $P(b_k) = \sum_{j=1}^{J} P(b_k \mid a_j) P(a_j)$

    $Q = \begin{bmatrix} P(b_1 \mid a_1) & P(b_1 \mid a_2) & \cdots & P(b_1 \mid a_J) \\ P(b_2 \mid a_1) & & & \vdots \\ \vdots & & & \vdots \\ P(b_K \mid a_1) & P(b_K \mid a_2) & \cdots & P(b_K \mid a_J) \end{bmatrix}$

    Q: forward channel transition matrix (channel matrix)

    $\mathbf{v} = Q \mathbf{z}$


Conditional entropy function

    $H(\mathbf{z} \mid b_k) = -\sum_{j=1}^{J} P(a_j \mid b_k) \log P(a_j \mid b_k)$

    $H(\mathbf{z} \mid \mathbf{v}) = \sum_{k=1}^{K} H(\mathbf{z} \mid b_k) P(b_k) = -\sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log P(a_j \mid b_k)$

    $H(\mathbf{z} \mid \mathbf{v})$ is called the equivocation of
    $\mathbf{z}$ with respect to $\mathbf{v}$.

    The difference between $H(\mathbf{z})$ and $H(\mathbf{z} \mid \mathbf{v})$
    is the average information received upon observing a single output
    symbol -- the mutual information:

    $I(\mathbf{z}, \mathbf{v}) = H(\mathbf{z}) - H(\mathbf{z} \mid \mathbf{v})$


Mutual information

    $I(\mathbf{z}, \mathbf{v}) = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log \frac{P(a_j, b_k)}{P(a_j) P(b_k)} = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j)\, q_{kj} \log \frac{q_{kj}}{\sum_{i=1}^{J} P(a_i)\, q_{ki}}$

    where $Q = \{q_{kj}\}$ with $q_{kj} = P(b_k \mid a_j)$.

    The minimum possible value of $I(\mathbf{z}, \mathbf{v})$ is zero
    and occurs when the input and output symbols are statistically
    independent (i.e., $P(a_j, b_k) = P(a_j) P(b_k)$).

    The maximum value of $I(\mathbf{z}, \mathbf{v})$ over all possible
    $\mathbf{z}$ is the capacity $C$ of the channel described by the
    channel matrix $Q$:

    $C = \max_{\mathbf{z}} [\, I(\mathbf{z}, \mathbf{v}) \,]$


Channel capacity

    Capacity: the maximum rate at which information can be transmitted
    reliably through the channel.

    Channel capacity does not depend on the input probabilities of the
    source (i.e., on how the channel is used) but is a function of the
    conditional probabilities defining the channel alone (i.e., Q).


Binary entropy function

    For a binary source: $\mathbf{z} = [p_{bs}, 1 - p_{bs}]^T$

    $H_{bs}(p_{bs}) = -p_{bs} \log_2 p_{bs} - (1 - p_{bs}) \log_2 (1 - p_{bs})$

    $\max H_{bs} = 1$ bit, when $p_{bs} = 0.5$

    Binary symmetric channel (BSC) with error probability $p_e$:

    channel matrix: $Q = \begin{bmatrix} 1 - p_e & p_e \\ p_e & 1 - p_e \end{bmatrix}$

    probability of the received symbols:
    $\mathbf{v} = Q\mathbf{z} = [\,(1-p_e)\,p_{bs} + p_e (1-p_{bs}),\;\; p_e\, p_{bs} + (1-p_e)(1-p_{bs})\,]^T$

    mutual information:
    $I(\mathbf{z}, \mathbf{v}) = H_{bs}\big(p_{bs}(1-p_e) + (1-p_{bs})\,p_e\big) - H_{bs}(p_e)$

    $I(\mathbf{z}, \mathbf{v}) = 0$ when $p_{bs} = 0$ or $1$

    $I(\mathbf{z}, \mathbf{v}) = 1 - H_{bs}(p_e) = C$ when $p_{bs} = 0.5$, for any $p_e$

    $H_{bs}$ is used to estimate the source entropy; $I(\mathbf{z}, \mathbf{v})$
    is used to estimate the channel capacity.


    When $p_e = 0$ or $1$: $C = 1$ bit.

    When $p_e = 0.5$: $C = 0$ bits (no information transmitted).
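A sketch that evaluates the BSC mutual information above (standard closed form; the numbers are illustrative):

```python
import math

def Hbs(p):
    """Binary entropy function, in bits."""
    return 0.0 if p in (0.0, 1.0) else -p*math.log2(p) - (1-p)*math.log2(1-p)

def I_bsc(p_bs, p_e):
    """I(z,v) = Hbs(p_bs(1-p_e) + (1-p_bs)p_e) - Hbs(p_e) for a BSC."""
    return Hbs(p_bs*(1 - p_e) + (1 - p_bs)*p_e) - Hbs(p_e)

# Capacity is reached at p_bs = 0.5: C = 1 - Hbs(p_e)
print(I_bsc(0.5, 0.1), 1 - Hbs(0.1))  # both ~0.531 bits
print(I_bsc(0.5, 0.5))                # 0.0 -- no information transmitted
```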


Shannon's first theorem for noiseless coding

    Defines the minimum average code length per source symbol that can
    be achieved.

    For a zero-memory source (with finite ensemble $(A, \mathbf{z})$
    and statistically independent source symbols):

    Single symbols: $A = \{a_1, a_2, \ldots, a_J\}$

    Block symbols ($n$-tuples of symbols):
    $A' = \{\alpha_1, \alpha_2, \ldots, \alpha_{J^n}\}$, with
    $P(\alpha_i) = P(a_{j_1}) P(a_{j_2}) \cdots P(a_{j_n})$

    $H(\mathbf{z}') = -\sum_{i=1}^{J^n} P(\alpha_i) \log P(\alpha_i)$

    $H(\mathbf{z}') = n H(\mathbf{z})$ -- the entropy of a zero-memory
    source with block symbols.


n-extended source

    Use n symbols as a block symbol.

    The average word length of the n-extended code,
    $L'_{avg} = \sum_{i=1}^{J^n} P(\alpha_i)\, l(\alpha_i)$, is bounded by

    $H(\mathbf{z}') \le L'_{avg} < H(\mathbf{z}') + 1$

    so that, per source symbol,

    $H(\mathbf{z}) \le \frac{L'_{avg}}{n} < H(\mathbf{z}) + \frac{1}{n}$


Coding efficiency

    Coding efficiency: $\eta = \frac{n H(\mathbf{z})}{L'_{avg}}$

    Example:

    Original source: $P(a_1) = 2/3$, $P(a_2) = 1/3$
    $H(\mathbf{z}) = 0.918$ bits/symbol, $L_{avg} = 1$ bit/symbol
    $\eta = 0.918 / 1 = 0.918$

    Second extension source: $\mathbf{z}' = \{4/9, 2/9, 2/9, 1/9\}$
    $H(\mathbf{z}') = 1.83$ bits/symbol, $L'_{avg} = 17/9 = 1.89$ bits/symbol
    $\eta = 1.83 / 1.89 = 0.97$


Shannon's second theorem for noisy coding

    For any $R < C$ (the capacity of the zero-memory channel with
    matrix $Q$), there exist codes of block length $r$ and rate $R$
    such that the block decoding error can be made arbitrarily small.

    Rate-distortion function: given a distortion $D$, the minimum rate
    at which information about the source can be conveyed to the user:

    $R(D) = \min_{Q \in Q_D} I(\mathbf{z}, \mathbf{v})$

    $Q_D = \{\, Q \mid d(Q) \le D \,\}$

    Q: the transition probabilities from source symbols to output
    symbols implied by the compression

    d(Q): average distortion; D: distortion threshold


Example of rate-distortion function

    Consider a zero-memory binary source with equally probable source
    symbols {0, 1}:

    $R(D) = 1 - H_{bs}(D)$

    R(D) is monotonically decreasing and convex in the interval
    $(0, D_{max})$.

    The shape of the R-D function represents the coding efficiency of
    a coder.


Estimate the information content (entropy) of an image

    Construct a source model based on the relative frequency of
    occurrence of the graylevels:

    First-order estimate (histogram of single pixels): 1.81 bits/pixel

    Second-order estimate (histogram of pairs of adjacent pixels):
    1.25 bits/pixel

    Third-order, fourth-order, ...: approaching the source entropy

    Example image (4 x 8):

    21  21  21  95  169  243  243  243
    21  21  21  95  169  243  243  243
    21  21  21  95  169  243  243  243
    21  21  21  95  169  243  243  243
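A sketch reproducing the two entropy estimates for this image (assuming horizontally adjacent pairs with wrap-around at the row ends for the second-order estimate):

```python
import math
from collections import Counter

img = [[21, 21, 21, 95, 169, 243, 243, 243] for _ in range(4)]  # the 4x8 image
pixels = [p for row in img for p in row]

def estimate(samples):
    """First-order entropy of the sample histogram, in bits per sample."""
    n = len(samples)
    return -sum(c/n * math.log2(c/n) for c in Counter(samples).values())

print(estimate(pixels))  # first-order estimate: ~1.81 bits/pixel

# Histogram of horizontally adjacent pixel pairs (wrapping at row ends)
pairs = [(row[i], row[(i + 1) % 8]) for row in img for i in range(8)]
print(estimate(pairs) / 2)  # second-order estimate: ~1.25 bits/pixel
```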


    The difference between the higher-order estimates of entropy and
    the first-order estimate indicates the presence of interpixel
    redundancy.

    If the pixels were statistically independent, the higher-order
    estimates would be equivalent to the first-order estimate, and
    variable-length coding (VLC) would provide optimal compression
    (i.e., no further mapping would be necessary).

    First-order estimate of the difference-mapped source: 1.41 bits/pixel

    1.41 > 1.25 -> there is an even better mapping


Error-free compression -- variable-length coding (VLC)

    Huffman coding

    the most popular method; yields the smallest possible number of
    code symbols per source symbol

    construct the Huffman tree according to the source symbol
    probabilities

    code the Huffman tree

    compute the source entropy, average code length, and coding
    efficiency


    Huffman code is:

    a block code (each symbol is mapped to a fixed sequence of bits)

    instantaneous (decoded without referencing succeeding symbols)

    uniquely decodable (no code word is a prefix of another)

    for example: 010100111100 -> a3 a1 a2 a2 a6

    $H(\mathbf{z}) = 2.14$ bits/symbol, $L_{avg} = 2.2$ bits/symbol, $\eta = 0.973$
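A minimal Huffman-construction sketch; the six-symbol source below is an assumption chosen to be consistent with the slide's numbers (H = 2.14, L_avg = 2.2):

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code book {symbol: bitstring} from {symbol: probability}."""
    # Heap entries: (probability, tiebreak id, {symbol: partial code})
    heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)    # merge the two least probable nodes
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in c0.items()}
        merged.update({s: '1' + c for s, c in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

probs = {'a1': 0.4, 'a2': 0.3, 'a3': 0.1, 'a4': 0.1, 'a5': 0.06, 'a6': 0.04}
book = huffman_code(probs)
L_avg = sum(probs[s] * len(code) for s, code in book.items())
print(book, L_avg)   # L_avg = 2.2 bits/symbol
```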


    Variations of VLC


Variations of VLC (cont.)

    Considered when a large number of source symbols must be coded.

    Truncated Huffman coding

    Only some symbols (the most probable ones, here the top 12) are
    encoded with a Huffman code.

    A prefix code (whose probability is the sum of the probabilities
    of the remaining symbols) followed by a fixed-length code is used
    to represent all the other symbols.

    B-code

    Each code word is made up of continuation bits (C) and information
    bits (natural binary numbers).

    C can be 1 or 0, alternating between successive code words.

    E.g., a11 a2 a7 -> 001010101000010 or 101110001100110


Variations of VLC (cont.)

    Binary shift code

    Divide the symbols into blocks of equal size.

    Code the individual elements within all blocks identically.

    Add shift-up or shift-down symbols to identify each block (here,
    111).

    Huffman shift code

    Select one reference block.

    Sum the probabilities of all other blocks and use the sum to
    determine the shift symbol by the Huffman method (here, 00).


Arithmetic coding

    AC generates non-block codes (no look-up table as in Huffman
    coding).

    An entire sequence of source symbols is assigned a single code
    word.

    The code word itself defines an interval of real numbers between
    0 and 1. As the number of symbols in the message increases, the
    interval becomes smaller (more information bits are needed to
    represent this real number).

    Multiple symbols are coded jointly with one set of bits -> the
    number of bits per symbol is effectively fractional.


Arithmetic coding (cont.)

    Example: coding of a1 a2 a3 a3 a4

    Any real number in [0.06752, 0.0688) can represent the source
    symbols (e.g., 0.000100011b = 0.068359375).

    Theoretically, 0.068 is enough to represent the source symbols:
    3 decimal digits -> 0.6 decimal digits per symbol.

    Actually, 0.068359375 -> 9 bits for 5 symbols -> 1.8 bits per
    symbol.

    As the length of the source symbol sequence increases, the
    resulting arithmetic code approaches Shannon's bound.

    Two factors limit the coding performance:

    An end-of-message indicator is necessary to separate one message
    sequence from another.

    Finite-precision arithmetic.
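A sketch of the interval-shrinking step, using symbol slots that reproduce the [0.06752, 0.0688) interval above (P(a3) = 0.4 and 0.2 for the others -- an assumption consistent with those numbers):

```python
# Cumulative-probability slots: a1 [0,.2), a2 [.2,.4), a3 [.4,.8), a4 [.8,1)
probs = {'a1': 0.2, 'a2': 0.2, 'a3': 0.4, 'a4': 0.2}

def arithmetic_interval(message):
    """Shrink [low, high) by each symbol's slot; any number inside encodes it."""
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        cum = 0.0
        for s, p in probs.items():
            if s == sym:
                low, high = low + cum * span, low + (cum + p) * span
                break
            cum += p
    return low, high

print(arithmetic_interval(['a1', 'a2', 'a3', 'a3', 'a4']))  # ~(0.06752, 0.0688)
```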



LZW (Lempel-Ziv-Welch) coding

    Assigns fixed-length code words to variable-length sequences of
    source symbols, but requires no a priori knowledge of the
    probabilities of occurrence of the symbols.

    LZW compression has been integrated into the GIF, TIFF, and PDF
    formats.

    For a 9-bit (512-word) dictionary, the latter half of the entries
    is constructed during the encoding process. An entry of two or
    more pixels is possible (i.e., assigned a 9-bit code).

    If the size of the dictionary is too small, the detection of
    matching gray-level sequences will be less likely; if too large,
    the size of the code words will adversely affect compression
    performance.
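A compact LZW-encoder sketch over 8-bit data (the 256 single-byte entries are fixed; new multi-pixel entries take the next codes, 9 bits each while the dictionary holds at most 512 words):

```python
def lzw_encode(data):
    """Emit one code per longest dictionary match; grow the dictionary as we go."""
    dictionary = {bytes([i]): i for i in range(256)}
    current, codes = b'', []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate                # keep extending the match
        else:
            codes.append(dictionary[current])  # emit the longest match
            dictionary[candidate] = len(dictionary)  # register a new entry
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes

# Two identical 4-pixel runs: the second run reuses the new entries
print(lzw_encode(bytes([39, 39, 126, 126, 39, 39, 126, 126])))
# -> [39, 39, 126, 126, 256, 258]
```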


LZW coding (cont.)

    The LZW decoder builds an identical decompression dictionary as it
    simultaneously decodes the bit stream.

    Note: dictionary overflow must be handled.


Bit-plane coding

    Bit-plane decomposition: an m-bit graylevel can be written as

    $a_{m-1} 2^{m-1} + a_{m-2} 2^{m-2} + \cdots + a_1 2^1 + a_0 2^0$

    Binary: the bit change between two adjacent codes may be
    significant (e.g., 127 and 128).

    Gray code: only 1 bit changes between any two adjacent codes:

    $g_i = a_i \oplus a_{i+1}, \quad 0 \le i \le m-2$
    $g_{m-1} = a_{m-1}$

    Gray-coded bit planes are less complex than the corresponding
    binary bit planes.
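A two-line sketch of the Gray-code mapping (the XOR-shift form is equivalent to the bitwise definition above):

```python
def binary_to_gray(a):
    """g = a XOR (a >> 1): each g_i = a_i XOR a_{i+1}, top bit unchanged."""
    return a ^ (a >> 1)

# 127 and 128 differ in all 8 binary bits, but in only 1 Gray-coded bit:
print(format(binary_to_gray(127), '08b'))  # 01000000
print(format(binary_to_gray(128), '08b'))  # 11000000
```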

    (Figure: binary-coded vs. gray-coded bit planes)


Lossless compression for binary images

    1-D run-length coding: RLC + VLC according to the run-length
    statistics

    2-D run-length coding: used for FAX image compression

    Relative address coding (RAC): based on the principle of tracking
    the binary transitions that begin and end each black and white
    run, combined with VLC


Other methods for lossless compression of binary images

    White block skipping: code solid white lines as 0s and all other
    lines with a 1 followed by the original bit pattern

    Direct contour tracing: represent each contour by a single
    boundary point and a set of directionals

    Predictive differential quantizing (PDQ): a scan-line-oriented
    contour tracing procedure

    Comparison of various algorithms: only the entropies after pixel
    mapping are computed, instead of real encoding (see the next
    slide)


Comparison of various methods

    Run-length coding proved to be the best coding method for
    bit-plane coded images.

    2-D techniques (PDQ, DDC, RAC) perform better when compressing
    binary images or higher bit-plane images.

    Gray-coded images proved to gain an additional 1 bit/pixel of
    compression efficiency relative to binary-coded images.


Lossless predictive coding

    Eliminates the interpixel redundancies of closely spaced pixels by
    extracting and coding only the new information (the difference
    between the actual and the predicted value) in each pixel:

    $e_n = f_n - \hat{f}_n$

    System architecture:

    the same predictor on the encoder and decoder sides

    rounding of the predicted value

    RLC symbol encoder


Methods of prediction

    Use a local, global, or adaptive predictor to generate $\hat{f}_n$.

    Linear prediction: a linear combination of the m previous pixels

    $\hat{f}_n = \mathrm{round}\left[ \sum_{i=1}^{m} \alpha_i f_{n-i} \right]$

    m: order of prediction; $\alpha_i$: prediction coefficients

    Use local neighborhoods (e.g., pixels 1, 2, 3, 4) for the
    prediction of pixel X in 2-D images:

        2 3 4
        1 X

    Special case: previous-pixel predictor.
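A sketch of the previous-pixel special case (first-order prediction along each row; the first column is transmitted verbatim):

```python
import numpy as np

def predictive_encode(img, alpha=1.0):
    """e_n = f_n - round(alpha * f_{n-1}) along each row."""
    img = img.astype(np.int32)
    err = img.copy()
    err[:, 1:] = img[:, 1:] - np.round(alpha * img[:, :-1]).astype(np.int32)
    return err

def predictive_decode(err, alpha=1.0):
    """The decoder runs the identical predictor on its own reconstruction."""
    img = err.copy()
    for x in range(1, err.shape[1]):
        img[:, x] = err[:, x] + np.round(alpha * img[:, x - 1]).astype(np.int32)
    return img

f = np.random.randint(0, 256, (4, 8))
assert np.array_equal(predictive_decode(predictive_encode(f)), f)  # lossless
```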


Entropy reduction by difference mapping

    Due to the removal of interpixel redundancies by prediction, the
    first-order entropy of the difference mapping is lower than that
    of the original image (3.96 bits/pixel vs. 6.81 bits/pixel).

    The probability density function of the prediction errors is
    highly peaked at zero and characterized by a relatively small
    variance, modeled by the zero-mean, uncorrelated Laplacian pdf:

    $p(e) = \frac{1}{\sqrt{2}\,\sigma_e}\, e^{-\sqrt{2}\,|e| / \sigma_e}$


Lossy predictive coding

    Error-free encoding of images seldom results in more than a 3:1
    reduction in data.

    A lossy predictive coding model: the predictors in the encoder and
    decoder must be equivalent (the same) -- achieved by placing the
    encoder's predictor within a feedback loop:

    $\dot{f}_n = \dot{e}_n + \hat{f}_n$

    This closed-loop configuration prevents error buildup at the
    decoder's output.


Delta modulation (DM)

    DM:

    $\hat{f}_n = \alpha \dot{f}_{n-1}$

    $\dot{e}_n = +\zeta$ for $e_n > 0$; $-\zeta$ otherwise

    The resulting bit rate is 1 bit/pixel.

    Slope overload effect: $\zeta$ is too small to represent the
    input's large changes, leading to blurred object edges.

    Granular noise effect: $\zeta$ is too large to represent the
    input's small changes, leading to grainy or noisy surfaces.


    Delta modulation (cont)


Optimal predictors

    Minimize the encoder's mean-square prediction error

    $E\{e_n^2\} = E\{ [f_n - \hat{f}_n]^2 \}$

    with $\dot{f}_n = \dot{e}_n + \hat{f}_n \approx e_n + \hat{f}_n = f_n$
    and $\hat{f}_n = \sum_{i=1}^{m} \alpha_i f_{n-i}$ (DPCM).

    Select the m prediction coefficients to minimize

    $E\{e_n^2\} = E\left\{ \left[ f_n - \sum_{i=1}^{m} \alpha_i f_{n-i} \right]^2 \right\}$

    The solution: $\boldsymbol{\alpha} = R^{-1} \mathbf{r}$, where
    $\boldsymbol{\alpha} = [\alpha_1, \alpha_2, \ldots, \alpha_m]^T$,

    $R = \begin{bmatrix} E\{f_{n-1} f_{n-1}\} & \cdots & E\{f_{n-1} f_{n-m}\} \\ \vdots & & \vdots \\ E\{f_{n-m} f_{n-1}\} & \cdots & E\{f_{n-m} f_{n-m}\} \end{bmatrix}, \quad \mathbf{r} = \begin{bmatrix} E\{f_n f_{n-1}\} \\ E\{f_n f_{n-2}\} \\ \vdots \\ E\{f_n f_{n-m}\} \end{bmatrix}$


Optimal predictors (cont.)

    Variance of the prediction error:

    $\sigma_e^2 = \sigma^2 - \boldsymbol{\alpha}^T \mathbf{r} = \sigma^2 - \sum_{i=1}^{m} \alpha_i E\{f_n f_{n-i}\}$

    For a 2-D Markov source with separable autocorrelation function

    $E\{f(x,y) f(x-i, y-j)\} = \sigma^2 \rho_v^i \rho_h^j$

    and a 4th-order linear predictor

    $\hat{f}(x,y) = \alpha_1 f(x,y-1) + \alpha_2 f(x-1,y-1) + \alpha_3 f(x-1,y) + \alpha_4 f(x+1,y-1)$

    the optimal coefficients are $\alpha_1 = \rho_h$,
    $\alpha_2 = -\rho_v \rho_h$, $\alpha_3 = \rho_v$, $\alpha_4 = 0$.

    We generally have $\sum_{i=1}^{m} \alpha_i \le 1$ to reduce the
    DPCM decoder's susceptibility to transmission noise (this confines
    the impact of an input error to a small number of outputs).


Examples of predictors

    Four examples:

    $\hat{f}(x,y) = 0.97 f(x,y-1)$

    $\hat{f}(x,y) = 0.5 f(x,y-1) + 0.5 f(x-1,y)$

    $\hat{f}(x,y) = 0.75 f(x,y-1) + 0.75 f(x-1,y) - 0.5 f(x-1,y-1)$

    $\hat{f}(x,y) = 0.97 f(x,y-1)$ if $\Delta h \le \Delta v$, and
    $0.97 f(x-1,y)$ otherwise (adaptive, with a local measure of the
    directional property)

    The visually perceptible error decreases as the order of the
    predictor increases.


Optimal quantization

    The staircase quantization function is an odd function that can be
    described by the decision levels ($s_i$) and the reconstruction
    levels ($t_i$).

    Quantizer design is to select the best $s_i$ and $t_i$ for a
    particular optimization criterion and input probability density
    function $p(s)$.


L-level Lloyd-Max quantizer

    Optimal in the mean-square error sense:

    $\int_{s_{i-1}}^{s_i} (s - t_i)\, p(s)\, ds = 0, \quad i = 1, 2, \ldots, L/2$

    $s_i = \begin{cases} 0 & i = 0 \\ (t_i + t_{i+1})/2 & i = 1, 2, \ldots, L/2 - 1 \\ \infty & i = L/2 \end{cases}$

    $s_{-i} = -s_i, \quad t_{-i} = -t_i$

    $t_i$ is the centroid of each decision interval; the decision
    levels are halfway between the reconstruction levels -> a
    non-uniform quantizer.

    How about an optimal uniform quantizer?
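A sketch of the two Lloyd-Max conditions iterated on sampled data -- an empirical stand-in for integrating a known pdf p(s):

```python
import numpy as np

def lloyd_max(samples, L, iters=100):
    """Alternate the optimality conditions: t_i = centroid of its decision
    interval; s_i = midpoint between adjacent reconstruction levels."""
    t = np.linspace(samples.min(), samples.max(), L)  # initial levels
    for _ in range(iters):
        s = (t[:-1] + t[1:]) / 2            # decision levels (midpoints)
        idx = np.digitize(samples, s)       # assign samples to intervals
        for i in range(L):
            if np.any(idx == i):
                t[i] = samples[idx == i].mean()  # centroid update
    return s, t

rng = np.random.default_rng(0)
e = rng.laplace(0.0, 4.0, 10000)            # Laplacian-like prediction errors
s, t = lloyd_max(e, 4)                      # a non-uniform 4-level quantizer
print("decision:", s, "reconstruction:", t)
```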


Transform coding

    Subimage decomposition

    Transformation: decorrelates the pixels of each subimage (energy
    packing)

    Quantization: eliminates the coefficients that carry the least
    information

    Coding


Transform selection

    Requirements:

    orthonormal or unitary forward and inverse transformation kernels

    basis functions or basis images

    separable and symmetric kernels

    Types: DFT, DCT, WHT (Walsh-Hadamard transform)

    Compression is achieved during the quantization of the transformed
    coefficients (not during the transformation step).


WHT

    The summation in the exponent is performed in modulo-2 arithmetic;
    $b_k(z)$ is the k-th bit (from right to left) in the binary
    representation of z:

    $g(x,y,u,v) = h(x,y,u,v) = \frac{1}{N} (-1)^{\sum_{i=0}^{m-1} [b_i(x) p_i(u) + b_i(y) p_i(v)]}, \quad N = 2^m$

    $T(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, g(x,y,u,v)$

    $f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} T(u,v)\, h(x,y,u,v)$


DCT

    $g(x,y,u,v) = h(x,y,u,v) = \alpha(u)\,\alpha(v) \cos\left[\frac{(2x+1) u \pi}{2N}\right] \cos\left[\frac{(2y+1) v \pi}{2N}\right]$

    $\alpha(u) = \begin{cases} \sqrt{1/N} & u = 0 \\ \sqrt{2/N} & u = 1, 2, \ldots, N-1 \end{cases}$
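A sketch of the separable 2-D DCT built directly from this kernel:

```python
import numpy as np

def dct_matrix(N=8):
    """C[u, x] = alpha(u) * cos((2x + 1) u pi / (2N)) -- an orthonormal kernel."""
    C = np.zeros((N, N))
    for u in range(N):
        alpha = np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)
        for x in range(N):
            C[u, x] = alpha * np.cos((2*x + 1) * u * np.pi / (2 * N))
    return C

def dct2(block):
    """Separable 2-D DCT: T = C f C^T; the inverse is f = C^T T C."""
    C = dct_matrix(block.shape[0])
    return C @ block @ C.T

f = np.random.rand(8, 8)
T = dct2(f)
C = dct_matrix(8)
assert np.allclose(C.T @ T @ C, f)   # orthonormality: perfect inversion
```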


Approximations using DFT, WHT, and DCT

    Truncating 50% of the transformed coefficients based on maximum
    magnitudes.

    The RMS error values are 1.28 (DFT), 0.86 (WHT), and 0.68 (DCT).


Basis images

    The subimage is a linear combination of basis images:

    $\mathbf{F} = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u,v)\, \mathbf{H}_{uv}$

    where $\mathbf{H}_{uv}$ is the $n \times n$ basis image with
    elements $h(x, y, u, v)$.


Subimage size selection

    The most popular subimage sizes are 8x8 and 16x16.

    A comparison of n x n subimages for n = 2, 4, 8, 16, truncating
    75% of the resulting coefficients:

    The curves of WHT and DCT flatten as the subimage size becomes
    greater than 8x8, while that of the DFT does not.

    The blocking effect decreases as the subimage size increases.


    (Figure: original vs. 2x2, 4x4, and 8x8 subimage reconstructions)


Zonal vs. threshold coding

    The coefficients are retained based on:

    maximum variance -> zonal coding

    maximum magnitude -> threshold coding

    Zonal coding

    Each DCT coefficient is considered a random variable whose
    distribution can be computed over the ensemble of all transformed
    subimages.

    The variances can also be based on an assumed image model (e.g.,
    a Markov autocorrelation function).

    The coefficients of maximum variance are usually located around
    the origin.

    Threshold coding

    Select the coefficients that have the largest magnitudes.

    Causes far less error than the zonal coding result.


Bit allocation for zonal coding

    The retained coefficients are allocated bits according to:

    The DC coefficient is modeled by a Rayleigh density function.

    The AC coefficients are often modeled by a Laplacian or Gaussian
    density.

    The number of bits is made proportional to
    $\log_2 \sigma^2_{T(u,v)}$ (the information content of a Gaussian
    random variable is proportional to $\log_2(\sigma^2 / D)$).

    (Figure: zonal coding vs. threshold coding masks)


Bit allocation in threshold coding

    The transform coefficients of largest magnitude make the most
    significant contribution to the reconstructed subimage quality.

    Inherently adaptive, in the sense that the locations of the
    retained transform coefficients vary from one subimage to another.

    The retained coefficients are reordered in a 1-D zigzag pattern.

    The mask pattern is run-length coded.
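A compact way to generate the zigzag ordering (anti-diagonals traversed in alternating directions):

```python
def zigzag_order(n=8):
    """Index pairs (u, v) of an n x n block in 1-D zigzag order."""
    return sorted(((u, v) for u in range(n) for v in range(n)),
                  key=lambda p: (p[0] + p[1],                # anti-diagonal
                                 p[0] if (p[0] + p[1]) % 2   # odd: downward
                                 else -p[0]))                # even: upward

order = zigzag_order()
print(order[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```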


    (Figure: zonal mask and bit allocation; thresholded and retained
    coefficients; zigzag ordering pattern)


Bit allocation in threshold coding (cont.)

    Thresholding the transform coefficients -- the threshold can be
    varied as a function of the location of each coefficient within
    the subimage.

    This results in a variable code rate, but offers the advantage
    that thresholding and quantization can be combined, by replacing
    the masking $\gamma(u,v)\, T(u,v)$ with the normalized (quantized)
    value

    $\hat{T}(u,v) = \mathrm{round}\left[ \frac{T(u,v)}{Z(u,v)} \right]$

    where Z(u,v) is the transform normalization array. The recovered
    (denormalized) value is

    $\dot{T}(u,v) = \hat{T}(u,v)\, Z(u,v)$

    The elements Z(u,v) can be scaled to achieve a variety of
    compression levels.
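A sketch of normalization and denormalization with the well-known JPEG luminance array for Z (the standard's example table):

```python
import numpy as np

Z = np.array([[16, 11, 10, 16,  24,  40,  51,  61],
              [12, 12, 14, 19,  26,  58,  60,  55],
              [14, 13, 16, 24,  40,  57,  69,  56],
              [14, 17, 22, 29,  51,  87,  80,  62],
              [18, 22, 37, 56,  68, 109, 103,  77],
              [24, 35, 55, 64,  81, 104, 113,  92],
              [49, 64, 78, 87, 103, 121, 120, 101],
              [72, 92, 95, 98, 112, 100, 103,  99]])

def normalize(T, scale=1.0):
    """T_hat(u,v) = round(T(u,v) / Z(u,v)); scaling Z trades rate for quality."""
    return np.round(T / (scale * Z)).astype(int)

def denormalize(T_hat, scale=1.0):
    """T_dot(u,v) = T_hat(u,v) * Z(u,v), the decoder's recovered coefficients."""
    return T_hat * (scale * Z)
```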

Normalization (quantization) matrix

    Used in the JPEG standard.


Wavelet coding

    Requirements in encoding an image:

    an analyzing wavelet

    the number of decomposition levels

    The discrete wavelet transform (DWT) converts a large portion of
    the original image to horizontal, vertical, and diagonal
    decomposition coefficients with zero mean and Laplacian-like
    distributions.


Wavelet coding (cont.)

    Many computed coefficients can be quantized and coded to minimize
    intercoefficient and coding redundancy.

    The quantization can be adapted to exploit any positional
    correlation across the different decomposition levels.

    Subdivision of the original image is unnecessary, eliminating the
    blocking artifact that characterizes DCT-based approximations at
    high compression ratios.


1-D subband decomposition

    Analysis filters: low-pass $h_0(n)$ (approximation), high-pass
    $h_1(n)$ (detail)

    Synthesis filters: low-pass $g_0(n)$, high-pass $g_1(n)$

    2-D Subband image coding


    (Figure: four-band filter bank)

An example of Daubechies orthonormal filters (figure)

Comparison between DCT-based and DWT-based coding

    A noticeable decrease of error in the wavelet coding results
    (based on compression ratios of 34:1 and 67:1):

    DWT-based rms error: 2.29, 2.96

    DCT-based rms error: 3.42, 6.33


    Rms error: 3.72, 4.73 (based on compression ratios of 108:1 and
    167:1).

    Even the 167:1 DWT result (rms = 4.73) is better than the 67:1
    DCT result (rms = 6.33).

    At more than twice the compression ratio, the wavelet-based
    reconstruction has only 75% of the error of the DCT-based result.


Wavelet selection

    The wavelet selection affects all aspects of wavelet coding system
    design and performance:

    the computational complexity of the transform

    the system's ability to compress and reconstruct images with
    acceptable error

    It includes the decomposition filters and the reconstruction
    filters.

    Useful analysis property: the number of zero moments.

    Important synthesis property: the smoothness of the
    reconstruction.

    The most widely used expansion functions:

    Daubechies wavelets

    Biorthogonal wavelets: allow filters with binary coefficients
    (numbers of the form k/2^a)


Comparison between different wavelets

    (1) Haar wavelets: the simplest

    (2) Daubechies wavelets: the most popular imaging wavelets

    (3) Symlets: an extension of Daubechies wavelets with increased
    symmetry

    (4) Cohen-Daubechies-Feauveau wavelets: biorthogonal wavelets


Comparison between different wavelets (cont.)

    Comparisons in the number of operations required (for the analysis
    and reconstruction filters, respectively).

    As the computational complexity increases, the information-packing
    ability does as well.


Other issues

    The number of decomposition levels required: the initial
    decompositions are responsible for the majority of the data
    compression (3 levels is enough).

    Quantizer design:

    introduce an enlarged quantization interval around zero (i.e., the
    dead zone)

    adapt the size of the quantization interval from scale to scale

    quantization of coefficients at deeper decomposition levels
    impacts larger areas of the reconstructed image


Binary image compression standards

    Most of the standards are issued by:

    International Standardization Organization (ISO)

    Consultative Committee of the International Telephone and
    Telegraph (CCITT)

    CCITT Group 3 and 4 are for binary image compression:

    originally designed as facsimile (FAX) coding methods

    G3: nonadaptive, 1-D run-length coding

    G4: a simplified or streamlined version of G3, with 2-D coding
    only

    the coding approach is quite similar to the RAC method

    Joint Bilevel Imaging Group (JBIG): a joint committee of CCITT
    and ISO

    JBIG1: adaptive arithmetic compression technique (the best average
    and worst-case performance available)

    JBIG2: achieves compression 2 to 4 times greater than JBIG1


Continuous-tone image compression -- JPEG

    JPEG, wavelet-based JPEG-2000, JPEG-LS

    JPEG defines three coding systems:

    lossy baseline coding system

    extended coding system for greater compression, higher precision,
    or progressive reconstruction applications

    lossless independent coding system

    A product or system must include support for the baseline system.

    Baseline system:

    input, output: 8 bits; quantized DCT values: 11 bits

    the image is subdivided into blocks of 8x8 pixels for encoding

    pixels are level-shifted by subtracting 128 graylevels


JPEG (cont.)

    DCT-transformed.

    Quantized by using the quantization (normalization) matrix.

    The DC DCT coefficient is difference-coded relative to the DC
    coefficient of the previous block.

    The nonzero AC coefficients are coded with a VLC that defines the
    coefficient's value and the number of preceding zeros.

    The user is free to construct custom tables and/or arrays, which
    may be adapted to the characteristics of the image being
    compressed.

    A special EOB (end of block) code is added after the last nonzero
    AC coefficient.


Huffman coding for the DC and AC coefficients

    VLC = base code (category code) + value code

    An example: encode DC = -9

    category 4: -15,...,-8, 8,...,15 (base code = 101)

    total length = 7 -> value code length = 4

    For a category K, K additional bits are needed as the value code,
    computed as either the K LSBs of the value (positive) or the K
    LSBs of the 1's complement of its absolute value (negative):
    K = 4 -> value code = 0110

    complete code: 101 0110

    An example: encode AC = (0, -3)

    length of the zero run: 0; AC coefficient value: -3

    AC = -3 -> category = 2 -> 0/2 (Table 8.19) -> base code = 01

    value code = 00

    complete code: 0100
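A sketch of the category/value-code rule used in both examples:

```python
def category(v):
    """JPEG category K = number of bits needed for |v| (e.g., -9 -> 4)."""
    return abs(v).bit_length()

def value_code(v):
    """K LSBs of v if positive; K LSBs of the 1's complement of |v| if negative."""
    K = category(v)
    bits = v if v >= 0 else (~abs(v)) & ((1 << K) - 1)
    return format(bits, '0%db' % K)

print(category(-9), value_code(-9))  # 4 '0110'
print(category(-3), value_code(-3))  # 2 '00'
```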


A JPEG coding example

    Original 8x8 block:

     52  55  61  66  70  61  64  73
     63  59  66  90 109  85  69  72
     62  59  68 113 144 104  66  73
     63  58  71 122 154 106  70  69
     67  61  68 104 126  88  68  70
     79  65  60  70  77  68  58  75
     85  71  64  59  55  61  65  83
     87  79  69  68  65  76  78  94

    Level-shifted (subtract 128):

    -76 -73 -67 -62 -58 -67 -64 -55
    -65 -69 -62 -38 -19 -43 -59 -56
    -66 -69 -60 -15  16 -24 -62 -55
    -65 -70 -57  -6  26 -22 -58 -59
    -61 -67 -60 -24  -2 -40 -60 -58
    -49 -63 -68 -58 -51 -65 -70 -53
    -43 -57 -64 -69 -73 -67 -63 -45
    -41 -49 -59 -60 -63 -52 -50 -34

    DCT-transformed:

    -415  -29  -62   25   55  -20   -1    3
       7  -21  -62    9   11   -7   -6    6
     -46    8   77  -25  -30   10    7   -5
     -50   13   35  -15   -9    6    0    3
      11   -8  -13   -2   -1    1   -4    1
     -10    1    3   -3   -1    0    2   -1
      -4   -1    2   -1    2   -3    1   -2
      -1    1   -1   -2   -1   -1    0   -1


    Quantized:

    -26  -3  -6   2   2   0   0   0
      1  -2  -4   0   0   0   0   0
     -3   1   5  -1  -1   0   0   0
     -4   1   2  -1   0   0   0   0
      1   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0

    1-D zigzag scanning:
    [-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB]

    Symbols to be coded: (DPCM = -9, assumed), (0,-3), (0,1), (0,-3),
    (0,-2), (0,-6), (0,2), (0,-4), (0,1), (0,-4), (0,1), (0,1), (0,5),
    (1,2), (2,-1), (0,2), (5,-1), (0,-1), EOB

    Final codes:
    1010110, 0100, 001, 0100, 0101, 100001, 0110, 100011, 001, 100011,
    001, 001, 100101, 11100110, 110110, 0110, 11110100, 000, 1010


    JPEG 2000


JPEG-2000

    Increased flexibility in accessing the compressed data: portions
    of a JPEG 2000 compressed image can be extracted for
    re-transmission, storage, display, and editing.

    Procedures:

    DC level shift by subtracting 128.

    Optional decorrelation using a reversible or non-reversible linear
    combination of the components (e.g., RGB to YCrCb component
    transformation; i.e., both RGB-based and YCrCb-based JPEG-2000
    coding are allowed):

    $Y_0(x,y) = 0.299\, I_0(x,y) + 0.587\, I_1(x,y) + 0.114\, I_2(x,y)$
    $Y_1(x,y) = -0.16875\, I_0(x,y) - 0.33126\, I_1(x,y) + 0.5\, I_2(x,y)$
    $Y_2(x,y) = 0.5\, I_0(x,y) - 0.41869\, I_1(x,y) - 0.08131\, I_2(x,y)$

    Y1 and Y2 are difference images, highly peaked around zero.


JPEG-2000 (cont.)

    Optionally divided into tiles (rectangular arrays of pixels that
    can be extracted and reconstructed independently), providing a
    mechanism for accessing a limited region of a coded image.

    A 2-D DWT using a biorthogonal 5-3 coefficient filter (lossless)
    or a 9-7 coefficient filter (lossy) produces a low-resolution
    approximation and horizontal, vertical, and diagonal frequency
    components (LL, HL, LH, HH bands).


    9-7 filter for lossy DWT; 5-3 filter for lossless DWT.

    5-3 analysis filter taps:

    i     H0 (lowpass)    H1 (highpass)
    0     6/8             1
    +-1   2/8             -1/2
    +-2   -1/8


JPEG-2000 (cont.)

    A fast DWT algorithm or a lifting-based approach is often used.

    Iterative decomposition of the approximation part is used to
    obtain further scales: important visual information is
    concentrated in a few coefficients.

    Difference from subband decomposition: the standard does not
    specify the number of scales (i.e., the number of decomposition
    levels) to be computed.



JPEG-2000 (cont.)

    Coefficient quantization is adapted to individual scales and
    subbands:

    $q_b(u,v) = \mathrm{sign}[a_b(u,v)]\, \mathrm{floor}\!\left[ \frac{|a_b(u,v)|}{\Delta_b} \right]$

    quantization step size: $\Delta_b = 2^{R_b - \varepsilon_b} \left( 1 + \frac{\mu_b}{2^{11}} \right)$

    $R_b$ is the nominal dynamic range (in bits) of subband b;
    $\varepsilon_b$ and $\mu_b$ are the numbers of bits allocated to
    the exponent and mantissa of the subband's coefficients.

    The numbers of exponent and mantissa bits must be provided to the
    decoder on a subband basis (explicit quantization) or for the LL
    subband only (implicit quantization: the remaining subbands are
    quantized using extrapolated LL subband parameters).

    For error-free compression, $\mu_b = 0$, $R_b = \varepsilon_b$,
    and $\Delta_b = 1$.


JPEG-2000 (cont.)

    Quantized coefficients are arithmetically coded on a bit-plane
    basis.

    The coefficients are arranged into rectangular blocks called code
    blocks, which are individually coded one bit plane at a time,
    starting from the most significant bit plane with nonzero elements
    down to the least significant bit plane.

    Encoding uses a context-based arithmetic coding method.

    The coding outputs from each code block are grouped to form
    layers.

    The resulting layers are finally partitioned into packets.

    The encoder can encode M_b bit planes for a particular subband
    while the decoder decodes only N_b of them, due to the embedded
    nature of the code stream.


    (Figure: the bit-streams of code blocks 1-7, spread across
    decomposition levels N-1, N, and N+1, are partitioned into layers
    1-5; some layer slots are empty. The layer axis corresponds to
    R-D / quality, the decomposition-level axis to resolution.)


Video compression -- motion-compensated transform coding

    Standards:

    video teleconferencing standards (H.261, H.263, H.263+, H.320,
    H.264)

    multimedia standards (MPEG-1/2/4)

    A combination of predictive coding (in the temporal domain) and
    transform coding (in the spatial domain).

    Estimate (predict) the image by simply propagating pixels in the
    previous frame along their motion trajectories -> motion
    compensation.

    The success depends on the accuracy, speed, and robustness of the
    displacement estimator -> motion estimation.

    Motion estimation is based on image blocks (16x16 or 8x8 pixels).


Motion-compensated transform coding -- encoder (figure)


Motion-compensated transform coding -- decoder

    (Figure: compressed bit stream -> BUF -> VLD -> IT -> (+) ->
    reconstructed frame; the MC predictor, driven by the motion
    vectors, feeds the adder.)

    IT: inverse DCT transform

    VLD: variable-length decoding


2-D motion estimation techniques

    Assumptions:

    rigid translational movements only

    uniform illumination in time and space

    Methods:

    optical flow approach

    pel-recursive approach

    block-matching approach

    Different from motion estimation in target tracking:

    causal operations (forward prediction; no future frames at the
    decoder)

    goal: reproduce pictures with minimum distortion, not necessarily
    estimate accurate motion parameters


Block-matching (BM) motion estimation

    Find the best match between the current image block and candidate
    blocks in the previous frame by minimizing the cost

    $\sum_{(x,y) \in block} \left| f_n(x,y) - f_{n-1}(x + mv_x,\, y + mv_y) \right|$

    One motion vector (MV) $(mv_x, mv_y)$ per block between successive
    frames.

    (Figure: a block in frame k matched within a search window in
    frame k-1)
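A full-search sketch of this minimization (SAD cost over integer displacements in a +/-w window; the function name and parameters are illustrative):

```python
import numpy as np

def full_search(cur, prev, bx, by, B=16, w=7):
    """Exhaustively minimize the sum of absolute differences (SAD) between the
    current BxB block at (bx, by) and displaced blocks in the previous frame."""
    block = cur[by:by + B, bx:bx + B].astype(np.int32)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            y, x = by + dy, bx + dx
            if 0 <= y and 0 <= x and y + B <= prev.shape[0] and x + B <= prev.shape[1]:
                sad = np.abs(block - prev[y:y + B, x:x + B].astype(np.int32)).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad
```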


BM motion estimation techniques

    Full search method

    Fast search (reduced search):

    usually a multistage search procedure that evaluates fewer search
    points

    local (or sub-) optima in general

    evaluation: number of search points, noise immunity

    Existing algorithms:

    three-step method

    four-step method

    log-search method

    hierarchical matching method

    prediction search method

    Kalman prediction method


Three-step search

    The full search examines 15x15 = 225 positions; the three-step
    search examines 9 + 8 + 8 = 25 positions (assuming an MV range of
    -7 to +7).

    (Figure: search range)
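A sketch of the three-step refinement (step sizes 4, 2, 1 re-centered on the current best point; sad() is a hypothetical helper like the one in the full-search sketch):

```python
import numpy as np

def sad(cur, prev, bx, by, dx, dy, B=16):
    """SAD between the current block and a (dx, dy)-displaced previous block."""
    y, x = by + dy, bx + dx
    if y < 0 or x < 0 or y + B > prev.shape[0] or x + B > prev.shape[1]:
        return np.inf
    return np.abs(cur[by:by + B, bx:bx + B].astype(np.int32)
                  - prev[y:y + B, x:x + B].astype(np.int32)).sum()

def three_step_search(cur, prev, bx, by, B=16):
    """3x3 pattern with steps 4, 2, 1: 9 + 8 + 8 = 25 distinct positions
    (the re-used center is simply re-evaluated here for simplicity)."""
    mvx, mvy = 0, 0
    for step in (4, 2, 1):
        candidates = [(mvx + sx * step, mvy + sy * step)
                      for sy in (-1, 0, 1) for sx in (-1, 0, 1)]
        mvx, mvy = min(candidates,
                       key=lambda mv: sad(cur, prev, bx, by, mv[0], mv[1], B))
    return mvx, mvy
```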


Predictive search

    Exploits the similarity of MVs between adjacent macroblocks.

    Obtain an initially predicted MV from the neighboring macroblocks
    by averaging: MVpredict = (MV1 + MV2 + MV3) / 3

    Prediction is followed by a small-area detailed search.

    Significantly reduces the CPU time, by 4~10x.

    (Figure: search area centered on the predicted MV, with MV1, MV2,
    MV3 taken from neighboring macroblocks)


Fractional-pixel motion estimation

    Question: what happens if motion vectors are real numbers instead
    of integers?

    Answer: find the integer motion vector first and then interpolate
    from the surrounding pixels for a refined search. The most popular
    case is half-pixel accuracy (i.e., 1 out of 8 positions for
    refinement). 1/3-pixel accuracy is proposed in the H.263L
    standard.

    Half-pixel accuracy often leads to a match with a smaller residue
    (higher reconstruction quality), hence decreasing the bit rate
    required in the subsequent encoding.

    (Figure: the integer MV position $(mv_x, mv_y)$ and the candidate
    half-pixel MV positions around it)


Evaluation of BM motion estimation

    Advantages:

    straightforward, regular operation, promising for VLSI chip design

    robustness (no differential operations as in the optical flow
    approach)

    Disadvantages:

    cannot handle several moving objects within one block (e.g.,
    around occlusion boundaries)

    Improvements:

    post-processing of images to eliminate the blocking effect

    interpolation of images or averaging of motion vectors to reach
    subpixel accuracy


Image prediction structure

    I/P frames only (H.261, H.263):

    I  P  P  P  P  P  . . .        (forward prediction)

    I/P/B frames (MPEG-x):

    I B B B P B B B P B B B P B . . .
    (forward prediction + bidirectional prediction)

    I: intra-coded frame; P: predictive frame; B: bidirectional frame