DATA COMPRESSION (DC) NAME: JAGARNATH PASWAN Email: [email protected]

Data Compression (Dc)


8/8/2019 Data Compression (Dc)

http://slidepdf.com/reader/full/data-compression-dc 1/60


What is Data Compression

• Data compression refers to the process of reducing the amount of data required to represent a given quantity of information.

• Data are the means by which information is conveyed.

• Data that either provide no relevant information or simply restate what is already known are said to contain data redundancy.


Why Data Compression

• In terms of communications, the bandwidth of a digital communication link can be effectively increased by compressing data at the sending end and decompressing data at the receiving end.

• In terms of storage, the capacity of a storage device can be effectively increased with methods that compress a body of data on its way to the storage device and decompress it when it is retrieved.


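Compression Techniques

Coding Type | Basis | Technique
Entropy Coding | | Run-Length Coding
Entropy Coding | | Huffman Coding
Entropy Coding | | Arithmetic Coding
Entropy Coding | | Lempel-Ziv-Welch (LZW) Coding
Source Coding | Prediction | DPCM
Source Coding | Transformation | DCT
Source Coding | Layered Coding | Subband Coding
Source Coding | | Vector Quantization
Hybrid Coding | | JPEG
Hybrid Coding | | MPEG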


Data Compression Methods

Lossless Compression
Text | Entropy | Shannon-Fano, Huffman, Arithmetic
Text | Dictionary | LZ, LZW

Lossy Compression
Audio | Audio codec part | DPCM, Sub-band
Image | Method | RLE, DCT
Video | Video codec part | DCT, Vector Quantization

Hybrid Compression
Video Conference | Method | JPEG, MPEG


Lossless Data Compression

1838: Samuel Finley Breese Morse


Information Theory

1948: Prof. Dr. Claude Elwood Shannon

“A Mathematical Theory of Communication”


Information Theory

Information is quantifiable as:
Information of a symbol = -log2(probability of occurrence)

For English (26 equally likely letters): -log2(1/26) = 4.7 bits

Entropy (in our context): the smallest number of bits needed, on average, to represent a symbol (the average over all symbols' code lengths):

\[ H = -\sum_i p_i \log_2 p_i \]

Note: -log2(p_i) is the uncertainty in symbol i (or the “surprise” when we see this symbol). Entropy is the average “surprise”.

Assumption: there are no dependencies between the symbols' appearances.
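The entropy definition above can be checked numerically. The following is a minimal Python sketch; the function names `self_information` and `entropy` are illustrative and do not come from the slides, and the symbols are assumed to be independent.

```python
import math

def self_information(p):
    """Information (in bits) carried by a symbol with probability p."""
    return -math.log2(p)

def entropy(probabilities):
    """Average information in bits per symbol, assuming independent symbols."""
    # Symbols with probability 0 never occur and contribute nothing.
    return sum(p * self_information(p) for p in probabilities if p > 0)

# 26 equally likely letters, as in the English example on the slide.
uniform_letters = [1 / 26] * 26
print(round(entropy(uniform_letters), 2))  # 4.7
```

For a non-uniform source the same function gives a value below log2 of the alphabet size, which is the lower bound that the coding schemes on the following slides try to approach.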


Shannon-Fano Data Compression

1949:
- Prof. Dr. Claude Elwood Shannon
- Prof. Dr. Robert Mario Fano

"The Transmission of Information", Technical Report No. 65


Shannon-Fano Data Compression

1. Line up the symbols by decreasing probability of occurrence.
2. Divide the symbols into two groups so that both groups have equal or almost equal sums of probabilities.
3. Assign the value 0 to the first group and the value 1 to the second.
4. For each of the two groups, go to step 2.

Symbol       | A          | B          | C          | D          | E
Count        | 15         | 7          | 6          | 6          | 5
Probability  | 0.38461538 | 0.17948718 | 0.15384615 | 0.15384615 | 0.12820513
Code         | 00         | 01         | 10         | 110        | 111

Average code length = (2 bits * (15+7+6) + 3 bits * (6+5)) / 39 symbols = 2.28 bits per symbol
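The four steps can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: the function name `shannon_fano` is hypothetical, and the split point is assumed to be the one that minimises the difference between the two group sums (the slides only say "equal or almost equal").

```python
def shannon_fano(counts):
    """Build Shannon-Fano codes from a {symbol: count} mapping."""
    # Step 1: order symbols by decreasing count (stable for ties).
    symbols = sorted(counts, key=lambda s: -counts[s])
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(counts[s] for s in group)
        # Step 2: choose the split with the most balanced group sums.
        best_i, best_diff, running = 1, None, 0
        for i in range(1, len(group)):
            running += counts[group[i - 1]]
            diff = abs(total - 2 * running)
            if best_diff is None or diff < best_diff:
                best_i, best_diff = i, diff
        # Step 3: 0 for the first group, 1 for the second.
        for s in group[:best_i]:
            codes[s] += "0"
        for s in group[best_i:]:
            codes[s] += "1"
        # Step 4: repeat for each group.
        split(group[:best_i])
        split(group[best_i:])

    split(symbols)
    return codes

counts = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
codes = shannon_fano(counts)
print(codes)  # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
average = sum(counts[s] * len(codes[s]) for s in counts) / sum(counts.values())
print(round(average, 2))  # 2.28
```

With the counts from the table, the first split separates {A, B} from {C, D, E}, which reproduces the codes and the 2.28 bits per symbol shown on the slide.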


Huffman Data Compression

1952: Dr. David Albert Huffman

“A Method for the Construction of Minimum-Redundancy Codes”


Huffman Data Compression

Symbol       | A          | B          | C          | D          | E
Count        | 15         | 7          | 6          | 6          | 5
Probability  | 0.38461538 | 0.17948718 | 0.15384615 | 0.15384615 | 0.12820513
Code         | 0          | 100        | 101        | 110        | 111

1. Line up the symbols by decreasing probability.
2. Link the two symbols with the lowest probabilities into one new symbol whose probability is the sum of the probabilities of the two symbols.
3. Repeat step 2 until a single symbol with probability 1 is generated.
4. Trace the coding tree from the root (the generated symbol with probability 1) to the original symbols, assigning 1 to each lower branch and 0 to each upper branch.

Average code length = (1 bit * 15 + 3 bits * (7+6+6+5)) / 39 symbols = 2.23 bits per symbol
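A compact way to try the Huffman procedure is with a priority queue. The Python sketch below is illustrative only: the function name `huffman` is hypothetical, it assumes a non-empty alphabet, and the 0/1 labelling of branches depends on tie-breaking, so the exact bit patterns may differ from the slide while the code lengths stay the same.

```python
import heapq

def huffman(counts):
    """Build Huffman codes from a non-empty {symbol: count} mapping."""
    # Each heap entry: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Link the two least probable nodes into one new node.
        w0, _, codes0 = heapq.heappop(heap)
        w1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

counts = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
codes = huffman(counts)
print({s: len(c) for s, c in sorted(codes.items())})
# {'A': 1, 'B': 3, 'C': 3, 'D': 3, 'E': 3}
average = sum(counts[s] * len(codes[s]) for s in counts) / sum(counts.values())
print(round(average, 2))  # 2.23
```

The code lengths (1 bit for A, 3 bits for the others) match the table above and give the same 87 / 39 = 2.23 bits per symbol.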


Arithmetic Coding

1976:
- Prof. Peter Elias
- Prof. Jorma Rissanen
- Prof. Richard Clark Pasco

"Generalized Kraft Inequality and Arithmetic Coding"


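Arithmetic Coding

Source Symbol | Probability | Initial Subinterval
a1 | 0.2 | [0.0, 0.2)
a2 | 0.2 | [0.2, 0.4)
a3 | 0.4 | [0.4, 0.8)
a4 | 0.2 | [0.8, 1.0)

Let the message to be encoded be a3 a3 a1 a2 a4.

[Figure: successive subdivision of the coding interval]

Symbol Encoded | Current Interval | Subdivision Points
(start) | [0.0, 1.0) | 0.2, 0.4, 0.8
a3 | [0.4, 0.8) | 0.48, 0.56, 0.72
a3 | [0.56, 0.72) | 0.592, 0.624, 0.688
a1 | [0.56, 0.592) | 0.5664, 0.5728, 0.5856
a2 | [0.5664, 0.5728) | 0.56768, 0.56896, 0.57152
a4 | [0.57152, 0.5728) |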


Arithmetic Coding

Decoding:

Decode 0.572.

Since 0.4 < code word < 0.8, the first symbol should be a3. Applying the same test inside each successive subinterval yields the remaining symbols.

Therefore, the message is a3 a3 a1 a2 a4.
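The encoding and decoding walk-through can be reproduced with exact rational arithmetic. The following Python sketch is an illustration under stated assumptions: the names `MODEL`, `encode` and `decode` are hypothetical, intervals are treated as half-open, and the decoder is told the message length (the slides do not say how the end of the message is signalled).

```python
from fractions import Fraction as F

# Symbol -> (low, high) of its initial subinterval, from the table on the slide.
MODEL = {
    "a1": (F(0), F(2, 10)),
    "a2": (F(2, 10), F(4, 10)),
    "a3": (F(4, 10), F(8, 10)),
    "a4": (F(8, 10), F(1)),
}

def encode(message, model=MODEL):
    """Return the final [low, high) interval for the message."""
    low, high = F(0), F(1)
    for symbol in message:
        width = high - low
        s_low, s_high = model[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high

def decode(value, length, model=MODEL):
    """Recover `length` symbols from a value inside the final interval."""
    low, high = F(0), F(1)
    message = []
    for _ in range(length):
        width = high - low
        scaled = (value - low) / width  # position within the current interval
        for symbol, (s_low, s_high) in model.items():
            if s_low <= scaled < s_high:
                message.append(symbol)
                low, high = low + width * s_low, low + width * s_high
                break
    return message

low, high = encode(["a3", "a3", "a1", "a2", "a4"])
print(float(low), float(high))  # 0.57152 0.5728
print(decode(F(572, 1000), 5))  # ['a3', 'a3', 'a1', 'a2', 'a4']
```

Any value inside the final interval, such as 0.572, identifies the whole message; `Fraction` is used here only to avoid floating-point rounding at the interval boundaries.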


LZ Data Compression

1977:
- Prof. Abraham Lempel
- Prof. Dr. Jacob Ziv

“A Universal Algorithm for Sequential Data Compression”


LZ Data Compression

Total number of bits = 23. After compression = 13.

23


Comparison

 | Huffman | Arithmetic | Lempel-Ziv
Probabilities | Known in advance | Known in advance | Not known in advance
Alphabet | Known in advance | Known in advance | Not known in advance
Data loss | None | None | None
Symbols dependency | Not used | Not used | Used (better compression)
Preprocessing | Tree building | None | First pass on data (can be eliminated)
Entropy | If probabilities are negative powers of 2 | Very close | Best results when alphabet not known
Codewords | One codeword for each symbol | One codeword for all data | Codewords for set of alphabet
Intuition | Intuitive | Not intuitive | Not intuitive


LZW Data Compression

“A Technique for High Performance Data Compression”

1984:
- Prof. Abraham Lempel
- Prof. Dr. Jacob Ziv
- Dr. Terry A. Welch


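LZW Compression Algorithm

Input image (pixel values):

39 39 126 126
39 39 126 126
39 39 126 126

Currently Recognized Sequence | Pixel Being Processed | Encoded Output | Dictionary Location (Code Word) | Dictionary Entry
 | 39 | | |
39 | 39 | 39 | 256 | 39-39
39 | 126 | 39 | 257 | 39-126
126 | 126 | 126 | 258 | 126-126
126 | 39 | 126 | 259 | 126-39
39 | 39 | | |
39-39 | 126 | 256 | 260 | 39-39-126
126 | 126 | | |
126-126 | 39 | 258 | 261 | 126-126-39
39 | 39 | | |
39-39 | 126 | | |
39-39-126 | 126 | 260 | 262 | 39-39-126-126
126 | eof | | |

Total number of input pixels = 12. Number of coded outputs in the table = 7, plus the final code 126 for the sequence still pending at end of input.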
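The dictionary-building process in the LZW table can be traced with a short encoder. This Python sketch is illustrative: the function name `lzw_encode` and the parameter `first_free_code` are hypothetical, and the dictionary is assumed to start with the 256 single-pixel values 0 to 255, so new entries begin at code 256 as on the slide.

```python
def lzw_encode(pixels, first_free_code=256):
    """Encode a sequence of 8-bit pixel values with LZW."""
    # Initial dictionary: every single pixel value maps to itself.
    dictionary = {(v,): v for v in range(first_free_code)}
    next_code = first_free_code
    output = []
    current = ()  # currently recognized sequence
    for pixel in pixels:
        candidate = current + (pixel,)
        if candidate in dictionary:
            # Keep extending the recognized sequence.
            current = candidate
        else:
            # Emit the code for the recognized sequence and
            # store the new, longer sequence in the dictionary.
            output.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = (pixel,)
    if current:
        # Flush the sequence still pending at end of input.
        output.append(dictionary[current])
    return output

image = [39, 39, 126, 126] * 3  # the three rows shown on the slide
print(lzw_encode(image))
# [39, 39, 126, 126, 256, 258, 260, 126]
```

The first seven codes match the "Encoded Output" column row by row; the eighth is the pending 126 flushed when the input ends.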


Rate-Distortion Theory

1948: Prof. Dr. Claude Elwood Shannon

“A Mathematical Theory of Communication”


Rate-Distortion Theory

• Rate-distortion theory is a major branch of information theory that provides the theoretical foundations for lossy data compression. It addresses the problem of determining the minimal amount of entropy (or information) R that should be communicated over a channel so that the source (input signal) can be approximately reconstructed at the receiver (output signal) without exceeding a given distortion D.

Where:
R(D) = rate-distortion function
H = trade-off rate
D = distortion


Distortion Measures

• A distortion measure is a mathematical quantity that specifies how close an approximation is to its original.

• The average pixel difference is given by the mean square error (MSE).

• The size of the error relative to the signal is given by the signal-to-noise ratio (SNR).

• Another common measure is the peak signal-to-noise ratio (PSNR).
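The three measures can be computed directly from an original and a reconstructed signal. The Python sketch below uses one common convention, which is an assumption here since the slide's formulas are not available: MSE is the mean of squared differences, SNR compares the mean squared signal value to the MSE, and PSNR compares the squared peak value (255 for 8-bit data) to the MSE, both in decibels. The function names are illustrative.

```python
import math

def mse(original, approx):
    """Mean square error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(original, approx)) / len(original)

def snr_db(original, approx):
    """Signal-to-noise ratio in dB (undefined when the MSE is 0)."""
    signal_power = sum(x * x for x in original) / len(original)
    return 10 * math.log10(signal_power / mse(original, approx))

def psnr_db(original, approx, peak=255):
    """Peak signal-to-noise ratio in dB for a given peak value."""
    return 10 * math.log10(peak ** 2 / mse(original, approx))

# Sample values for illustration: an 8-bit signal and a lossy reconstruction.
original = [130, 150, 140, 200, 230]
reconstructed = [130, 154, 134, 200, 223]
print(mse(original, reconstructed))
print(round(psnr_db(original, reconstructed), 1))
```

A lower MSE and a higher SNR or PSNR both indicate a reconstruction closer to the original; identical signals give an MSE of 0, for which the two ratios are undefined.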


DPCM Data Compression

“Differential Quantization of Communication Signals”

1950: C. Chapin Cutler


DPCM Data Compression

[Figures: schematic diagram; an audio signal]


DPCM Data Compression

Prediction: fn' = trunc((f(n-1)'' + f(n-2)'') / 2), e.g. f4' = (134 + 154) / 2 = 144

Prediction error: en = fn - fn'

Quantized error: en' = Q[en] = 16 * trunc((255 + en) / 16) - 256 + 8

Reconstruction: fn'' = fn' + en'

Reconstruction error = quantization error: fn'' - fn = en' - en = q

fn   = 130 150 140 200 230
fn'  = 130 130 142 144 167
en   = 0 20 -2 56 63
en'  = 0 24 -8 56 56
fn'' = 130 154 134 200 223
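The numbers in the DPCM example can be regenerated with a few lines of Python. This is a sketch under assumptions that the slide does not state explicitly: the first sample is taken to be transmitted unquantized (so its error is 0), and for the second sample the missing earlier reconstruction is replaced by the first one. The function names `quantize` and `dpcm` are illustrative.

```python
def quantize(e):
    """Quantizer from the slide: 16 * trunc((255 + e) / 16) - 256 + 8."""
    return 16 * int((255 + e) / 16) - 256 + 8

def dpcm(samples):
    """Return (predictions, errors, quantized errors, reconstructions)."""
    first = samples[0]
    # Assumption: the first sample is sent as-is, with zero error.
    predictions, errors, q_errors, recon = [first], [0], [0], [first]
    for n in range(1, len(samples)):
        prev1 = recon[n - 1]
        prev2 = recon[n - 2] if n >= 2 else recon[0]
        # Predict from the two previous reconstructed values.
        prediction = int((prev1 + prev2) / 2)
        error = samples[n] - prediction
        q_error = quantize(error)
        predictions.append(prediction)
        errors.append(error)
        q_errors.append(q_error)
        # The decoder can form the same value from prediction + q_error.
        recon.append(prediction + q_error)
    return predictions, errors, q_errors, recon

preds, errs, q_errs, recon = dpcm([130, 150, 140, 200, 230])
print(preds)   # [130, 130, 142, 144, 167]
print(errs)    # [0, 20, -2, 56, 63]
print(q_errs)  # [0, 24, -8, 56, 56]
print(recon)   # [130, 154, 134, 200, 223]
```

Predicting from reconstructed values rather than from the original samples keeps encoder and decoder in step, so the quantization error does not accumulate.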


DCT Data Compression

“Discrete Cosine Transform”

1974:
- Dr. Nasir Ahmed
- Dr. T. Natarajan
- Dr. Kamisetty R. Rao


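DCT Data Compression

The One-Dimensional DCT

The most common DCT definition of a 1-D sequence of length N (here N = 8) is

\[ C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos\left[\frac{\pi (2x+1) u}{2N}\right] \]

for u = 0, 1, 2, …, N−1, where C(u) is the transform coefficient and f(x) is the 1-D pixel value.

Similarly, the inverse transformation is defined as

\[ f(x) = \sum_{u=0}^{N-1} \alpha(u)\, C(u) \cos\left[\frac{\pi (2x+1) u}{2N}\right] \]

for x = 0, 1, 2, …, N−1, with

\[ \alpha(u) = \begin{cases} \sqrt{1/N} & u = 0 \\ \sqrt{2/N} & u > 0 \end{cases} \]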
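A direct, unoptimised Python sketch of the 1-D transform pair may help to see what the coefficients mean. It assumes the usual orthonormal scaling, with a factor of sqrt(1/N) for u = 0 and sqrt(2/N) otherwise; the function names `alpha`, `dct_1d` and `idct_1d` are illustrative.

```python
import math

def alpha(u, n):
    """Normalisation factor: sqrt(1/N) for u = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct_1d(f):
    """Forward 1-D DCT of a sequence f(0..N-1)."""
    n = len(f)
    return [
        alpha(u, n) * sum(
            f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
            for x in range(n)
        )
        for u in range(n)
    ]

def idct_1d(c):
    """Inverse 1-D DCT: recovers f(x) from the coefficients C(u)."""
    n = len(c)
    return [
        sum(
            alpha(u, n) * c[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
            for u in range(n)
        )
        for x in range(n)
    ]

pixels = [52, 55, 61, 66, 70, 61, 64, 73]  # sample values for illustration
coefficients = dct_1d(pixels)
restored = idct_1d(coefficients)
print([round(v) for v in restored] == pixels)  # True
```

For a constant input all the energy ends up in C(0), the DC coefficient, and every other coefficient is zero up to rounding; this energy compaction is what the quantization step later exploits.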


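DCT Data Compression

The Two-Dimensional DCT

The 2-D DCT is a direct extension of the 1-D case and is given by

\[ C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{\pi (2x+1) u}{2N}\right] \cos\left[\frac{\pi (2y+1) v}{2N}\right] \]

for u,v = 0, 1, 2, …, N−1.

The inverse transform is defined as

\[ f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, C(u,v) \cos\left[\frac{\pi (2x+1) u}{2N}\right] \cos\left[\frac{\pi (2y+1) v}{2N}\right] \]

for x,y = 0, 1, 2, …, N−1, where

\[ \alpha(u) = \begin{cases} \sqrt{1/N} & u = 0 \\ \sqrt{2/N} & u > 0 \end{cases} \]

and likewise for α(v).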


DCT Data Compression

[Figures: the matrix form of the equation; one-dimensional cosine basis functions]


DCT Data Compression

Step 1: Divide the image into 8×8 blocks.

Step 2: The original image is levelled off by subtracting 128 from each entry.


DCT Data Compression

Step 3: Apply the DCT by matrix multiplication: D = T M T'.

Step 4: The DCT matrix D is then divided by the quantization table Q.

162.3 = DC coefficient

[Figures: DCT transform matrix; quantization table, quality level 50]


DCT Data Compression

Step 5: Divide D by Q element by element and round to the nearest integer value.

Step 6: Zig-zag scan the quantized matrix to compress the AC coefficients.

[Figures: quantized matrix; zig-zag scan]
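Steps 2 to 6 for a single 8×8 block can be sketched in plain Python as below. Several things are assumptions, not taken from the slides: the quantization table `Q50` is the widely used JPEG luminance table for quality level 50 (the slide's own table is an image that did not survive), the transform matrix `T` uses the orthonormal DCT scaling, and the helper names are illustrative. Python's `round` resolves exact halves to the nearest even integer.

```python
import math

N = 8
# Assumed: standard JPEG luminance quantization table (quality level 50).
Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# DCT matrix T: row 0 is constant, the other rows are sampled cosines.
T = [
    [
        1 / math.sqrt(N) if i == 0
        else math.sqrt(2 / N) * math.cos((2 * j + 1) * i * math.pi / (2 * N))
        for j in range(N)
    ]
    for i in range(N)
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def zigzag_order():
    """Index pairs of an N x N block in zig-zag scan order."""
    def key(ij):
        i, j = ij
        s = i + j
        # Odd diagonals run downwards, even diagonals run upwards.
        return (s, i if s % 2 else -i)
    return sorted(((i, j) for i in range(N) for j in range(N)), key=key)

def compress_block(block):
    m = [[p - 128 for p in row] for row in block]            # Step 2
    d = matmul(matmul(T, m), transpose(T))                   # Step 3: D = T M T'
    c = [[round(d[i][j] / Q50[i][j]) for j in range(N)]      # Steps 4 and 5
         for i in range(N)]
    return [c[i][j] for i, j in zigzag_order()]              # Step 6

flat_block = [[136] * N for _ in range(N)]
print(compress_block(flat_block)[:4])  # [4, 0, 0, 0]
```

For a flat block only the DC coefficient survives, so the zig-zag sequence is one value followed by a run of zeros, which is exactly the shape that run-length and entropy coding compress well.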


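DCT Data Compression

Decompression:

N = round(T' R T) + 128, where R is the quantized matrix multiplied element by element by the quantization table Q.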


DCT Data Compression

Comparison between original and decompressed image


DCT Data Compression

[Figures: Baboon, original image; Baboon, image decompressed after DCT compression]


JPEG (Joint Photographic Experts Group)

JPEG (pronounced jay-peg) is the most commonly used standard method of lossy compression for photographic images.

JPEG itself specifies only how an image is transformed into a stream of bytes, but not how those bytes are encapsulated in any particular storage medium.

A further standard, created by the Independent JPEG Group, called JFIF (JPEG File Interchange Format), specifies how to produce a file suitable for computer storage and transmission from a JPEG stream.

In common usage, when one speaks of a "JPEG file" one generally means a JFIF file, or sometimes an Exif JPEG file.

JPEG/JFIF is the format most used for storing and transmitting photographs on the web. It is not as well suited to line drawings and other textual or iconic graphics because its compression method performs badly on these types of images.


Baseline JPEG compression


Y = 0.299R + 0.587G + 0.114B

U = Cb = 0.492(B − Y) = −0.147R − 0.289G + 0.436B

V = Cr = 0.877(R − Y) = 0.615R − 0.515G − 0.100B

Y = luminance
Cr, Cb = chrominance
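The colour conversion can be tried out with a small Python function. This sketch applies the coefficients exactly as given on the slide; the function name `rgb_to_yuv` is illustrative, and no offset or clamping to an 8-bit range is applied, which is an assumption since the slide does not mention either.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to (Y, U, V) with the slide's coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v

for name, pixel in [("white", (255, 255, 255)), ("blue", (0, 0, 255))]:
    y, u, v = rgb_to_yuv(*pixel)
    print(name, round(y, 1), round(u, 1), round(v, 1))
```

Grey and white pixels have chrominance close to zero, so all of their information sits in the luminance channel; this separation is what allows the chrominance channels to be stored more coarsely.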


JPEG File Interchange Format (JFIF)

The encoded data is written into the JPEG File Interchange Format (JFIF), which, as the name suggests, is a simplified format allowing JPEG-compressed images to be shared across multiple platforms and applications.

JFIF includes embedded image and coding parameters, framed by appropriate header information.

Specifically, aside from the encoded data, a JFIF file must store all coding and quantization tables that are necessary for the JPEG decoder to do its job properly.


MPEG Data Compression

“Moving Picture Experts Group”

1988: Moving Picture Experts Group (MPEG)


MPEG-1 Data Compression


MPEG-2 Data Compression


MPEG-4 Data Compression


MPEG-7 Data Compression


H.261 Data Compression


