Image Compression Done By: Bassam Kanber Khawla Al-Hashedy Naglaa Fathi Redha Qaid Heba Al-Hakemy Hesham Al-Aghbary Supervisor Ensaf Al-Zurqa


Page 1: Image compression

Image Compression

Done By:

Bassam Kanber

Khawla Al-Hashedy

Naglaa Fathi

Redha Qaid

Heba Al-Hakemy

Hesham Al-Aghbary

Supervisor

Ensaf Al-Zurqa

Page 2: Image compression

Goal of Image Compression

The goal of image compression is to reduce the

amount of data required to represent a digital

image.

Page 3: Image compression

Data ≠ Information

Data and information are not synonymous terms!

Data is the means by which information is conveyed.

Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible.

Page 4: Image compression

Data vs Information (cont’d)

The same amount of information can be represented by various amounts of data.

Ex1:

Your wife, Helen, will meet you at Logan Airport in Boston at 5 minutes past 6:00 pm tomorrow night

Ex2:

Your wife will meet you at Logan Airport at 5 minutes past 6:00 pm tomorrow night

Ex3:

Helen will meet you at Logan at 6:00 pm tomorrow night

Page 5: Image compression

Definitions: Compression Ratio

Compression ratio: CR = n1 / n2

where n1 and n2 denote the number of information-carrying units (e.g., bits) in the original and compressed data sets, respectively.
Page 6: Image compression

Definitions: Data Redundancy

Relative data redundancy: RD = 1 − 1/CR

Example: if CR = 10, then RD = 0.9, i.e., 90% of the data in the original representation is redundant.
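As a minimal sketch (the bit counts below are illustrative assumptions, not numbers from the slides), both definitions can be computed directly:

```python
# Sketch: compression ratio CR = n1/n2 and relative data redundancy
# RD = 1 - 1/CR. The bit counts are hypothetical.

def compression_ratio(n1, n2):
    """n1, n2: information-carrying units (e.g., bits) before/after compression."""
    return n1 / n2

def relative_redundancy(cr):
    """Fraction of the original data that carries no information."""
    return 1 - 1 / cr

cr = compression_ratio(10_000, 1_000)  # 10,000 bits compressed to 1,000 bits
rd = relative_redundancy(cr)
print(cr, rd)  # 10.0 0.9 -> 90% of the original data is redundant
```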

Page 7: Image compression

Types of Data Redundancy

(1) Coding Redundancy

(2) Interpixel Redundancy

(3) Psychovisual Redundancy

Compression attempts to reduce one or more of these redundancy types.

Page 8: Image compression

Coding Redundancy

Code: a list of symbols (letters, numbers, bits, etc.)

Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels).

Code word length: number of symbols in each code word

Page 9: Image compression

Coding Redundancy (cont’d)

For an N x M image:
rk: k-th gray level
P(rk): probability of rk
l(rk): number of bits for rk

Expected value: Lavg = Σk l(rk) P(rk)   (average code word length, in bits/pixel)

Page 10: Image compression

Coding Redundancy (cont’d)

Case 1: l(rk) = constant length

Example:

Page 11: Image compression

Case 2: l(rk) = variable length
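The two cases can be sketched side by side for a hypothetical 4-level gray-level distribution (not the slides' exact table): a fixed-length code spends the same bits on every level, while a variable-length code spends fewer bits on likelier levels and so lowers Lavg.

```python
# Sketch: Lavg = sum_k l(r_k) * P(r_k) for fixed vs. variable-length codes,
# using an assumed 4-level probability distribution.

probs = [0.5, 0.25, 0.125, 0.125]            # P(r_k), sums to 1

# Case 1: fixed-length code, l(r_k) = 2 bits for every gray level
fixed_lengths = [2, 2, 2, 2]
lavg_fixed = sum(l * p for l, p in zip(fixed_lengths, probs))

# Case 2: variable-length code, shorter words for likelier levels
var_lengths = [1, 2, 3, 3]                   # e.g., code words 0, 10, 110, 111
lavg_var = sum(l * p for l, p in zip(var_lengths, probs))

print(lavg_fixed, lavg_var)  # 2.0 1.75 -> variable-length saves 0.25 bits/pixel
```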

Page 12: Image compression

Interpixel redundancy

Interpixel redundancy implies that pixel values are correlated (i.e., a pixel value can be reasonably predicted by its neighbors).

This correlation shows up in the autocorrelation of image rows: neighboring pixels are strongly correlated with shifted versions of themselves.

Page 13: Image compression

Interpixel redundancy (cont’d)

To reduce interpixel redundancy, the data must be transformed into another format (i.e., using a transformation), e.g., thresholding, DFT, DWT, etc.

Example:

original

thresholded

Page 14: Image compression

Psychovisual redundancy

The human eye does not respond with equal sensitivity to all visual information.

It is more sensitive to the lower frequencies than to the higher frequencies in the visual spectrum.

Idea: discard data that is perceptually insignificant!

Page 15: Image compression

Psychovisual redundancy (cont’d)

Example: quantization

Left: 256 gray levels. Middle: 16 gray levels. Right: 16 gray levels with random noise, i.e., a small pseudo-random number is added to each pixel prior to quantization.

C = 8/4 = 2:1

Page 16: Image compression

Fidelity Criteria

How close is f’(x,y) to f(x,y)?

Criteria

Subjective: based on human observers

Objective: mathematically defined criteria

Page 17: Image compression

Subjective Fidelity Criteria

Page 18: Image compression

Objective Fidelity Criteria

Root-mean-square error (RMSE):

e_rms = [ (1/MN) Σx Σy ( f’(x,y) − f(x,y) )² ]^(1/2)

Mean-square signal-to-noise ratio (SNR):

SNR_ms = Σx Σy f’(x,y)² / Σx Σy ( f’(x,y) − f(x,y) )²
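As a sketch, both criteria can be computed on tiny hypothetical images (flat lists of pixels standing in for M x N arrays; the pixel values are made up):

```python
import math

# Sketch: objective fidelity criteria between an original image f
# and its reconstruction g, both as flat pixel lists.

def rmse(f, g):
    """Root-mean-square error between original f and reconstruction g."""
    n = len(f)
    return math.sqrt(sum((gp - fp) ** 2 for fp, gp in zip(f, g)) / n)

def snr_ms(f, g):
    """Mean-square SNR: sum of g^2 over sum of squared error."""
    return sum(gp ** 2 for gp in g) / sum((gp - fp) ** 2 for fp, gp in zip(f, g))

f = [100, 102, 98, 100]   # hypothetical original
g = [101, 101, 99, 100]   # hypothetical reconstruction
print(rmse(f, g), snr_ms(f, g))
```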

Page 19: Image compression

Objective Fidelity Criteria (cont’d)

RMSE = 5.17 RMSE = 15.67 RMSE = 14.17

Page 20: Image compression

Image Compression Models

The image compression system is composed of two distinct structural blocks: an encoder and a decoder.

The encoder performs compression; the decoder performs decompression.

Page 21: Image compression

The encoder is made up of a source encoder, which removes input redundancies, and a channel encoder, which increases the noise immunity of the source encoder's output.

The decoder includes a channel decoder followed by a source decoder.

If the channel between the encoder and decoder is noise-free (not prone to error), the channel encoder and decoder are omitted.

The input image f(x,y) is fed into the encoder, which creates a compressed representation of the input. It is stored for later use, or transmitted for storage and use at a remote location.

When the compressed image is given to the decoder, a reconstructed output image f’(x,y) is generated.

The encoder input and decoder output are f(x,y) and f’(x,y), respectively. In video applications, they are f(x,y,t) and f’(x,y,t), where t is time.

Page 22: Image compression

1. The Source Encoder and Decoder:

The source encoder is responsible for reducing or eliminating any

coding, interpixel and psychovisual redundancies in the input

image.

The encoder removes the redundancies through a series of three independent operations.

Mapper: transforms f(x,y) into a format designed to reduce interpixel redundancies.

It is reversible.

It may or may not reduce the amount of data required to represent the image.

Ex: run-length coding.

Quantizer: reduces the accuracy of the mapper's output in

accordance with some pre-established fidelity criterion. This stage

reduces the psychovisual redundancies of the input image.

This operation is irreversible.

Page 23: Image compression

Encoder

Symbol Encoder: Generates a fixed or variable length

code to represent the quantizer output and maps the

output in accordance with the code.

• In most cases, a variable-length code is used to represent

the mapped and quantized data set. It assigns the shortest

code words to the most frequently occurring output values

and thus reduces coding redundancy.

• It is reversible.

• Upon its completion, the input image has been processed

for the removal of all 3 redundancies.

Page 24: Image compression

Encoder

Mapper: transforms input data in a way that facilitates reduction of interpixel redundancies.

Page 25: Image compression

Encoder

Quantizer: reduces the accuracy of the mapper’s output in accordance with some pre-established fidelity criteria.

Page 26: Image compression

Encoder

Symbol encoder: assigns the shortest code to the most frequently occurring output values.

Page 27: Image compression

Decoder

• Inverse operations are performed.

• But quantization is irreversible in general.

• Because quantization results in irreversible loss, an inverse quantizer block is not included in the decoder.

Page 28: Image compression
Page 29: Image compression

2. The Channel Encoder and Decoder:

The channel encoder and decoder play an important role in the overall encoding-decoding process when the channel is noisy or prone to error. They are designed to reduce the impact of channel noise by inserting a controlled form of redundancy into the source-encoded data.

As the output of the source encoder contains little redundancy, it would be highly sensitive to transmission noise without the addition of this "controlled redundancy."

One of the most useful channel encoding techniques was devised by R. W. Hamming (Hamming [1950]).

It is based on appending enough bits to the data being encoded to ensure that some minimum number of bits must change between valid code words.

Hamming(7,4) is a linear error-correcting code that encodes 4 bits of data into 7 bits by adding 3 parity bits.

Hamming's (7,4) algorithm can correct any single-bit error, or detect all single-bit and two-bit errors.
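A minimal sketch of Hamming(7,4) follows. The bit layout (parity bits at positions 1, 2, and 4) is one common convention, assumed here rather than taken from the slides:

```python
# Sketch of Hamming(7,4): 4 data bits -> 7-bit codeword with 3 parity bits,
# able to correct any single-bit error.

def hamming74_encode(d):
    """d: list of 4 data bits -> codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(c):
    """Locate and flip a single-bit error via the parity-check syndrome."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # checks positions 2, 3, 6, 7
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]   # checks positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s4       # 0 = no error, else 1-based error position
    if pos:
        c[pos - 1] ^= 1
    return c

code = hamming74_encode([1, 0, 1, 1])
corrupted = list(code)
corrupted[2] ^= 1                    # flip one bit during "transmission"
assert hamming74_correct(corrupted) == code
```

The syndrome directly spells out the error position in binary, which is the elegance of Hamming's construction: each parity bit covers the positions whose index has that bit set.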

Page 30: Image compression

Elements of Information Theory

Measuring Information

The generation of information is modeled as a probabilistic process. Random event E occurs with probability P(E)

I(E) = log(1 / P(E)) = −log P(E)

The base of the logarithm determines the unit used to measure the information. If base 2 is selected, the resulting unit is called the bit. If P(E) = 0.5 (two equally likely events), the information is one bit.

I(E) is called the self-information of E.
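The definition above is a one-liner in code:

```python
import math

# Sketch: self-information I(E) = -log2 P(E), measured in bits.

def self_information(p):
    """Information conveyed by an event with probability p (base-2 log)."""
    return -math.log2(p)

print(self_information(0.5))   # 1.0 bit: two equally likely events
print(self_information(0.25))  # 2.0 bits: rarer events carry more information
```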

Page 31: Image compression

The Information Channel

Information channel is the physical medium that connects the information source to the user of information.

Self-information is transferred between an information source and a user of the information, through the information channel.

Information source

Generates a random sequence of symbols from a finite or countably infinite set of possible symbols.

Output of the source is a discrete random variable

Page 32: Image compression

The source :

Modeled as a discrete random variable

Source alphabet A={aj}

Symbols (letters) aj with probabilities P(aj)

Page 33: Image compression

A simple information system

Output of the channel is also a discrete random variable which takes on values from a finite or countably infinite set of symbols {b1, b2, …, bK} called the channel alphabet B.

Page 34: Image compression

Entropy

H = −Σj P(aj) log P(aj)   (units/pixel)

Conditional entropy function

Page 35: Image compression

Entropy Estimation

It is not easy to estimate H reliably!

Page 36: Image compression

Entropy

First-order estimate of H:

Page 37: Image compression

Entropy

Second-order estimate of H:

Use relative frequencies of pixel blocks
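Both estimates can be sketched from relative frequencies. The 8-pixel row below is a small hypothetical example; note that its first-order estimate happens to come out near 1.81 bits/pixel:

```python
import math
from collections import Counter

# Sketch: entropy estimates from relative frequencies.
# First order uses single pixels; second order uses pixel pairs.

def entropy(samples):
    """H = -sum p log2 p over the relative frequencies of the samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

row = [21, 21, 21, 95, 169, 243, 243, 243]   # hypothetical image row

h1 = entropy(row)                    # first-order estimate (bits/pixel)
pairs = list(zip(row, row[1:]))      # overlapping pixel pairs (blocks of 2)
h2 = entropy(pairs) / 2              # second-order estimate, per pixel

print(h1, h2)                        # h2 < h1: pairs expose interpixel redundancy
```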

Page 38: Image compression

Estimating Entropy

The first-order estimate provides only a lower-bound on the compression that can be achieved.

Differences between higher-order estimates of entropy and the first-order estimate indicate the presence of interpixel redundancy!

Page 39: Image compression

Estimating Entropy

For example, consider differences:


Page 40: Image compression

Estimating Entropy

Entropy of difference image:

Better than before (i.e., H=1.81 for original image)

However, a better transformation could be found since:

Page 41: Image compression

Compression Types

Compression

Error-Free Compression

(Loss-less)

Lossy Compression

Page 42: Image compression

Error-Free Compression

Some applications require no error in compression (medical images, business documents, etc.).

CR=2 to 10 can be expected.

Make use of coding redundancy and inter-pixel

redundancy.

Ex: Huffman codes, LZW, Arithmetic coding, 1D and 2D

run-length encoding, Loss-less Predictive Coding, and

Bit-Plane Coding.

Page 43: Image compression

Run-length encoding (RLE)

A very simple form of data compression: each run of repeated symbols is stored as a single data value and count.

Ex: AAAABBCCCAA

Sol: 4A2B3C2A
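A sketch of both directions, using the count-then-symbol notation of the example above (counting each run of the string AAAABBCCCAA gives runs of 4, 2, 3, and 2):

```python
import re
from itertools import groupby

# Sketch of run-length encoding: each run becomes count + symbol.

def rle_encode(s):
    """'AAAABBCCCAA' -> '4A2B3C2A'."""
    return "".join(f"{len(list(run))}{sym}" for sym, run in groupby(s))

def rle_decode(encoded):
    """Inverse: expand each (count, symbol) pair back into a run."""
    return "".join(sym * int(cnt) for cnt, sym in re.findall(r"(\d+)(\D)", encoded))

print(rle_encode("AAAABBCCCAA"))        # 4A2B3C2A
print(rle_decode("4A2B3C2A"))           # AAAABBCCCAA
```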

Page 44: Image compression

Huffman Coding

The most popular technique for removing coding redundancy is due to Huffman (1952)

A variable-length coding technique.

Optimal code: it yields the smallest number of code symbols per source symbol.

Assumption: symbols are encoded one at a time!

Page 45: Image compression

Huffman Coding

Page 46: Image compression

Huffman Coding

Lavg = 1(0.4) + 2(0.3) + 3(0.1) + 4(0.1) + 5(0.06) + 5(0.04) = 2.2 bits/symbol
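The Huffman merge procedure can be sketched with a priority queue over the probabilities used above (0.4, 0.3, 0.1, 0.1, 0.06, 0.04); every time two nodes are merged, each symbol beneath them gains one bit of code length:

```python
import heapq

# Sketch: Huffman code-word lengths via repeated merging of the two
# least-probable nodes. Symbol identities are just indices here.

def huffman_lengths(probs):
    """Return the Huffman code-word length for each symbol."""
    heap = [(p, [i]) for i, p in enumerate(probs)]  # (prob, symbols under node)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for s in s1 + s2:            # each covered symbol gains one bit
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths

probs = [0.4, 0.3, 0.1, 0.1, 0.06, 0.04]
lengths = huffman_lengths(probs)
lavg = sum(l * p for l, p in zip(lengths, probs))
print(lavg)   # ~2.2 bits/symbol, matching the slide's Lavg
```

Tie-breaking among equal probabilities can change individual code-word lengths, but the optimal average length Lavg is the same for any valid Huffman tree.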

Page 47: Image compression

Arithmetic coding

A 5-symbol message, a1 a2 a3 a3 a4, from a 4-symbol source is coded.

Source Symbol   Probability   Initial Subinterval

a1 0.2 [0.0, 0.2)

a2 0.2 [0.2, 0.4)

a3 0.4 [0.4, 0.8)

a4 0.2 [0.8, 1.0)

Page 48: Image compression

Arithmetic coding

Page 49: Image compression

The final message symbol narrows the interval to [0.06752, 0.0688).

Any number between this interval can be used to represent the message.

E.g. 0.068

3 decimal digits are used to represent the 5 symbol message.
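The interval-narrowing procedure can be sketched directly from the table above; each symbol rescales the current interval into its own subinterval:

```python
# Sketch: arithmetic coding of the message a1 a2 a3 a3 a4
# using the initial subintervals from the table.

intervals = {              # symbol -> [low, high) on [0, 1)
    "a1": (0.0, 0.2),
    "a2": (0.2, 0.4),
    "a3": (0.4, 0.8),
    "a4": (0.8, 1.0),
}

def arithmetic_encode(message):
    low, high = 0.0, 1.0
    for sym in message:
        s_low, s_high = intervals[sym]
        width = high - low                         # rescale the subinterval
        low, high = low + width * s_low, low + width * s_high
    return low, high       # any number in [low, high) encodes the message

low, high = arithmetic_encode(["a1", "a2", "a3", "a3", "a4"])
print(low, high)           # ~[0.06752, 0.0688); 0.068 falls inside
```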

Page 50: Image compression

Fixed Length: LZW Coding

Error Free Compression Technique

Remove Inter-pixel redundancy

Requires no a priori knowledge of the probability distribution of pixels

Assigns fixed length code words to variable length

sequences

Patented Algorithm US 4,558,302

Included in GIF and TIFF and PDF file formats

Page 51: Image compression

Coding Technique

A codebook or a dictionary has to be constructed

For an 8-bit monochrome image, the first 256 entries are assigned to the gray levels 0,1,2,..,255.

As the encoder examines image pixels, gray level sequences that are not in the dictionary are assigned to a new entry.

For instance sequence 255-255 can be assigned to entry 256, the address following the locations reserved for gray levels 0 to 255.

Page 52: Image compression

Example

Consider the following 4 x 4, 8-bit image:

39 39 126 126

39 39 126 126

39 39 126 126

39 39 126 126

Dictionary Location Entry

0 0

1 1

. .

255 255

256 -

511 -

Page 53: Image compression

39 39 126 126

39 39 126 126

39 39 126 126

39 39 126 126

• Is 39 in the dictionary? ……Yes

• What about 39-39? ……No

• Then add 39-39 at entry 256

• And output the last recognized symbol: 39

Dictionary Location Entry

0 0

1 1

. .

255 255

256 39-39

511 -
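The walkthrough above can be sketched end to end. Flattening the 4 x 4 image row by row and running the LZW loop reproduces the dictionary growth shown on the slides (39-39 lands at entry 256, and so on):

```python
# Sketch of LZW encoding for a sequence of 8-bit gray levels.
# The dictionary starts with the 256 single gray levels 0..255.

def lzw_encode(pixels):
    dictionary = {(g,): g for g in range(256)}
    next_code = 256
    out = []
    current = ()
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                  # keep growing the sequence
        else:
            out.append(dictionary[current])      # emit last recognized sequence
            dictionary[candidate] = next_code    # e.g., 39-39 -> entry 256
            next_code += 1
            current = (p,)
    out.append(dictionary[current])              # flush the final sequence
    return out

pixels = [39, 39, 126, 126] * 4                  # the 4 x 4 image, row by row
print(lzw_encode(pixels))                        # 10 codes for 16 pixels
```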

Page 54: Image compression

Bit-Plane Coding

An effective technique to reduce interpixel redundancy is to process each bit plane individually.

The image is decomposed into a series of binary

images.

Each binary image is compressed using one of

well known binary compression techniques.
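The decomposition step can be sketched with bit shifts; each plane is a binary image, and recombining all planes recovers the original pixels exactly:

```python
# Sketch: decompose 8-bit gray levels into 8 binary bit planes.

def bit_planes(pixels, bits=8):
    """planes[b][i] = bit b of pixels[i], b = 0 (LSB) .. bits-1 (MSB)."""
    return [[(p >> b) & 1 for p in pixels] for b in range(bits)]

pixels = [0, 127, 128, 255]
planes = bit_planes(pixels)
print(planes[7])   # MSB plane: [0, 0, 1, 1]
print(planes[0])   # LSB plane: [0, 1, 0, 1]

# Recombining the planes is lossless:
recon = [sum(planes[b][i] << b for b in range(8)) for i in range(len(pixels))]
assert recon == pixels
```

The MSB plane typically contains large uniform regions (well suited to run-length coding), while the low-order planes look nearly random.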

Page 55: Image compression

Bit-Plane Decomposition

Page 56: Image compression

Bit-Plane Encoding

One-dimensional run-length coding

Two-dimensional run-length coding

Page 57: Image compression

Loss-less Predictive Encoding

Page 58: Image compression
Page 59: Image compression

Decoding LZW

Use the dictionary for decoding the “encoded output” sequence.

The dictionary need not be sent with the encoded output.

It can be built on the fly by the decoder as it reads the received code words.

Page 60: Image compression

Run-length coding (RLC) (interpixel redundancy)

Encodes repeating string of symbols (i.e., runs) using a few bytes: (symbol, count)

1 1 1 1 1 0 0 0 0 0 0 1 → (1,5) (0,6) (1,1)

a a a b b b b b b c c → (a,3) (b,6) (c,2)

Can compress any type of data but cannot achieve high compression ratios compared to other compression methods.

Page 61: Image compression

Bit-plane coding (interpixel redundancy)

Process each bit plane individually.

(1) Decompose an image into a series of binary images.

(2) Compress each binary image (e.g., using run-length coding)

Page 62: Image compression

Combining Huffman Coding with Run-length Coding

Assuming that a message has been encoded using Huffman coding, additional compression can be achieved using run-length coding.

e.g., (0,1)(1,1)(0,1)(1,0)(0,2)(1,4)(0,2)

Page 63: Image compression

Lossy Methods - Taxonomy

See “Image Compression Techniques”, IEEE Potentials, February/March 2001

Page 64: Image compression

Lossy Compression

Transform the image into a domain where compression can be performed more efficiently (i.e., reduce interpixel redundancies).

The image is divided into ~(N/n)² subimages.

Page 65: Image compression

Transform Selection

T(u,v) can be computed using various transformations, for example:

DFT

DCT (Discrete Cosine Transform)

KLT (Karhunen-Loeve Transformation)

Page 66: Image compression

Example: Fourier Transform

The magnitude of the FT decreases as u, v increase!

K << N

Page 67: Image compression

DCT (Discrete Cosine Transform)

Forward:

T(u,v) = α(u) α(v) Σx Σy f(x,y) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]

Inverse:

f(x,y) = Σu Σv α(u) α(v) T(u,v) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]

where α(0) = √(1/n) and α(u) = √(2/n) for u = 1, …, n−1, with sums over 0, …, n−1.
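As a sketch, here is the forward/inverse pair in 1-D (the 2-D transform applies the same kernel along rows and then columns); the sample values are arbitrary:

```python
import math

# Sketch: orthonormal 1-D DCT (type II) and its inverse for a length-n signal.

def alpha(u, n):
    """Normalization: sqrt(1/n) for u = 0, sqrt(2/n) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct(f):
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              for x in range(n)) for u in range(n)]

def idct(T):
    n = len(T)
    return [sum(alpha(u, n) * T[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n)) for x in range(n)]

f = [52.0, 55.0, 61.0, 66.0]
T = dct(f)                 # energy compacts into the low-frequency coefficients
recon = idct(T)            # perfect reconstruction (up to rounding)
assert all(abs(a - b) < 1e-9 for a, b in zip(f, recon))
```

Compression comes from quantizing or discarding the small high-frequency coefficients of T before inverting.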

Page 68: Image compression

DCT (cont’d)

Set of basis functions for a 4x4 image (i.e., cosines of different frequencies).

Page 69: Image compression

DCT (cont’d)

RMS error: 2.32, 1.78, 1.13

Using 8 x 8 subimages, 64 coefficients per subimage, 50% of the coefficients truncated.

Page 70: Image compression

DCT (cont’d)

DCT reduces "blocking artifacts" (i.e., boundaries between subimages do not become very visible).

Page 71: Image compression

DCT (cont’d)

DCT reduces "blocking artifacts" (i.e., boundaries between subimages do not become very visible).

DFT: n-point periodicity gives rise to discontinuities!

DCT: 2n-point periodicity prevents discontinuities!

Page 72: Image compression

DCT (cont’d)

Subimage size selection:

Page 73: Image compression

Image compression standards

ISO & CCITT

Binary

Continuous-tone

Video

Page 74: Image compression

Binary image compression standards

Fax: transmitting documents over telephone networks.

CCITT Group 3 (Huffman) 1D & 2D

CCITT Group 4 2D

JBIG 1

Page 75: Image compression

Continuous-tone still image compression standard: JPEG

Modes: baseline, lossless, progressive, and hierarchical.

Page 76: Image compression

JPEG 2000

wavelet and sub-band technologies

Embedded Block Coding with Optimized

Truncation (EBCOT)

Page 77: Image compression

Features of JPEG2000

Lossless and lossy compression.

Protective image security.

Region-of-interest coding.

Robustness to bit errors.

Page 78: Image compression

JPEG-LS

The latest ISO/ITU-T standard for lossless coding of still images; it also provides for “near-lossless” compression.

Part-I:

the baseline system, is based on:

adaptive prediction, context modeling and Golomb

coding.

Part-II (still under preparation).

Designed for low-complexity.

Page 79: Image compression

Video Compression Standards

MPEG-1

The driving focus of the standard was storage of multimedia content on a standard CD-ROM, which supported data transfer rates of 1.4 Mb/s and a total storage capacity of about 600 MB.

MPEG-2

Designed to provide the capability for compressing, coding,

and transmitting high quality, multi-channel, multimedia

signals over broadband networks.

e.g., ATM.

MPEG-4

Digital television, interactive graphics and the World Wide

Web.