The JPEG Standard
1. Introduction
The JPEG standard is a collaboration among:
- International Telecommunication Union (ITU)
- International Organization for Standardization (ISO)
- International Electrotechnical Commission (IEC)
The official names of JPEG:
- Joint Photographic Experts Group
- ISO/IEC 10918-1, Digital compression and coding of continuous-tone still images
- ITU-T Recommendation T.81
Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan, ROC
Slide 3
1. Introduction
JPEG has the following modes:
- Lossless mode: predictive coding
- Sequential mode: DCT-based coding
- Progressive mode: DCT-based coding
- Hierarchical mode
Baseline system:
- Sequential mode, DCT-based coding, Huffman coding for entropy encoding
- The most widely used mode in practice
Slide 4
General Image Storage System
Slide 5
The three most popular color models are RGB (used in computer graphics); YIQ, YUV, or YCbCr (used in video systems); and CMYK (used in color printing). All of these color spaces can be derived from the RGB information supplied by devices such as cameras and scanners.
[Figure: red plane, green plane, blue plane]
Slide 6
[Table: RGB values for 100% amplitude, 100% saturated color bars, a common video test signal]
The RGB color space is the most prevalent choice for computer graphics because color displays use red, green, and blue to create the desired color. The choice of the RGB color space therefore simplifies the architecture and design of the system. Also, a system designed around the RGB color space can take advantage of a large number of existing software routines, since this color space has been around for a number of years.
Slide 7
RGB Cons: However, RGB is not very efficient when dealing with
real-world images. All three RGB components need to be of equal
bandwidth to generate any color within the RGB color cube. (The
result of this is a frame buffer that has the same pixel depth and
display resolution for each RGB component). Also, processing an
image in the RGB color space is usually not the most efficient
method. For example, to modify the intensity or color of a given
pixel, the three RGB values must be read from the frame buffer, the
intensity or color calculated, the desired modifications performed,
and the new RGB values calculated and written back to the frame
buffer. If the system had access to an image stored directly in the
intensity and color format, some processing steps would be faster.
Slide 8
YUV Color Space
In many applications it is desirable to describe a color in terms of its luminance and chrominance content separately, to enable more efficient processing and transmission of color signals.
One such coordinate system is the YUV color space:
- Y is the luminance component
- Cb and Cr are the chrominance components
The values in the YUV coordinates are related to the values in the RGB coordinates by
[Equation: RGB-to-YUV conversion]
Slide 9
YUV Color Space
The YUV color space is used by the PAL (Phase Alternation Line), NTSC (National Television System Committee), and SECAM (Séquentiel Couleur Avec Mémoire, or Sequential Color with Memory) composite color video standards. The black-and-white system used only luma (Y) information; color information (U and V) was added in such a way that a black-and-white receiver would still display a normal black-and-white picture. Color receivers decoded the additional color information to display a color picture.
Slide 10
Image Preparation
Image Component/Plane: a gray-scale image consists of a single component, i.e. intensity. A YUV image consists of three planes, namely:
- Luminance plane (Y plane)
- Chrominance plane U
- Chrominance plane V
[Figure: planes C1 (luminance Y), C2 (chrominance U), and C3 (chrominance V); each small rectangle represents a pixel]
Slide 11
Color Specification
Luminance: the perceived brightness of the light, which is proportional to the total energy in the visible band.
Chrominance: describes the perceived color tone of a light, which depends on the wavelength composition of the light. Chrominance is in turn characterized by two attributes:
- Hue: specifies the color tone, which depends on the peak wavelength of the light
- Saturation: describes how pure the color is, which depends on the spread or bandwidth of the light spectrum
Slide 12
[Figure: luminance only, chrominance only, and full color image]
Luma represents the achromatic image, while the chroma components represent the color information. Converting R'G'B' sources (such as the output of a camera) into luma and chroma allows for chroma subsampling: because human vision has finer spatial sensitivity to luminance ("black and white") differences than to chromatic differences, video systems can store chromatic information at lower resolution, optimizing perceived detail at a particular bandwidth.
Slide 13
2. Color Space Conversion
(a) Translate from RGB to YCbCr
(b) Translate from YCbCr to RGB
Slide 14
The basic equations to convert between 8-bit digital RGB data with a nominal range and YCbCr are:
Y = 0.299R + 0.587G + 0.114B
Cb = -0.172R - 0.339G + 0.511B + 128
Cr = 0.511R - 0.428G - 0.083B + 128
R = Y + 1.371(Cr - 128)
G = Y - 0.698(Cr - 128) - 0.336(Cb - 128)
B = Y + 1.732(Cb - 128)
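The conversion equations on this slide can be tried out with a minimal Python sketch. The function names `rgb_to_ycbcr` and `ycbcr_to_rgb` are illustrative only, and the sketch assumes the minus signs as they appear in the usual form of these equations; it does no clamping or rounding, so values slightly outside 0..255 can occur.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr) using the slide's coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.172 * r - 0.339 * g + 0.511 * b + 128
    cr = 0.511 * r - 0.428 * g - 0.083 * b + 128
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Convert one (Y, Cb, Cr) triple back to RGB using the slide's coefficients."""
    r = y + 1.371 * (cr - 128)
    g = y - 0.698 * (cr - 128) - 0.336 * (cb - 128)
    b = y + 1.732 * (cb - 128)
    return r, g, b


if __name__ == "__main__":
    # A gray pixel carries no chrominance: Cb and Cr sit at the 128 offset.
    print(rgb_to_ycbcr(128, 128, 128))
    # The round trip is only approximate because the coefficients are rounded.
    print(ycbcr_to_rgb(*rgb_to_ycbcr(255, 0, 0)))
```

Because the coefficients are rounded to three decimals, the round trip reproduces the input only to within a fraction of one gray level.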
Slide 15
Image Preparation
Data Unit/Block: Each block is made of 8x8 pixels. This definition of a block, or data unit, comes from the DCT. These blocks of 8x8 pixels are transferred to the next step as a unit for processing. There are two ways these data units are passed to the next step.
Non-interleaved data ordering: In this case, data units are passed component by component. All the data units from the first component are passed, then those from the second component, and so on. Data units from each component are passed in left-to-right, top-to-bottom order.
Slide 16
Data Unit/Block
Non-interleaved data ordering: Using this mode for an RGB-encoded image with very high resolution, the display would initially present only the red component, then the green, and then the blue, resulting in the original image.
[Figure: data units of 8x8 pixels passed to the image processing step]
Slide 17
Data Unit/Block
Interleaved data ordering: In this approach data units from different components are combined into a Minimum Coded Unit (MCU). If all components have the same resolution, an MCU consists of exactly one data unit from each component.
[Figure: MCU1, MCU2, MCU3, MCU4, MCU5, and so on, passed to the image processing step]
Slide 18
Data Unit/Block
Interleaved data ordering: If the components do not all have the same resolution, the MCU comprises a different number of data units from each component. The number of data units from each component is calculated from the relative horizontal and vertical sampling ratios. Data units from each component are taken in left-to-right, top-to-bottom order. MCUs are also constructed in left-to-right, top-to-bottom order.
Example 1: Let three planes of an image have the following resolutions:
Plane 0: X0 = 512, Y0 = 256
Plane 1: X1 = 256, Y1 = 512
Plane 2: X2 = 128, Y2 = 256
Slide 19
Data Unit/Block
Example 1: Counting the data units of each component (each data unit is 8x8 pixels), we find:
Plane 0 (512x256 pixels): 64 x 32 data units
Plane 1 (256x512 pixels): 32 x 64 data units
Plane 2 (128x256 pixels): 16 x 32 data units
Slide 20
Data Unit/Block
Example 1: Hi and Vi are called the relative sampling ratios and are calculated for each plane:
Hi = Xi / Xmin
Vi = Yi / Ymin
So we get
Plane 0: H0 = 4, V0 = 1
Plane 1: H1 = 2, V1 = 2
Plane 2: H2 = 1, V2 = 1
Hi and Vi must be integer values between 1 and 4. An MCU is then built by taking
- H0 x V0 data units from plane 0
- H1 x V1 data units from plane 1
- H2 x V2 data units from plane 2
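The calculation of Hi and Vi can be sketched in Python as follows. The helper name `sampling_factors` and its error handling are assumptions made for illustration; only the formulas Hi = Xi / Xmin and Vi = Yi / Ymin and the 1 to 4 limit come from the slide.

```python
def sampling_factors(plane_sizes):
    """Return a list of (H, V) pairs, one per plane.

    plane_sizes is a list of (X, Y) plane dimensions in pixels.
    """
    x_min = min(x for x, _ in plane_sizes)
    y_min = min(y for _, y in plane_sizes)
    factors = []
    for x, y in plane_sizes:
        # The ratios must be whole numbers between 1 and 4.
        if x % x_min != 0 or y % y_min != 0:
            raise ValueError("plane size is not an integer multiple of the minimum")
        h, v = x // x_min, y // y_min
        if not (1 <= h <= 4 and 1 <= v <= 4):
            raise ValueError("sampling ratio outside the range 1..4")
        factors.append((h, v))
    return factors


if __name__ == "__main__":
    # Example 1 from the slides.
    print(sampling_factors([(512, 256), (256, 512), (128, 256)]))
```

Multiplying H and V for each plane gives the number of data units that plane contributes to one MCU.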
Slide 21
Data Unit/Block
Example 1: [Figure: one MCU, made of H0 x V0 = 4 data units from plane 0, H1 x V1 = 4 from plane 1, and H2 x V2 = 1 from plane 2]
Example 2: Let an image have four planes with the following dimensions:
Plane 0: X0 = 48 pixels, Y0 = 32 pixels
Plane 1: X1 = 48 pixels, Y1 = 16 pixels
Plane 2: X2 = 24 pixels, Y2 = 32 pixels
Plane 3: X3 = 24 pixels, Y3 = 16 pixels
Slide 22
Data Unit/Block
Example 2: If we calculate Hi and Vi as in the previous example, we find
H0 = 2, V0 = 2
H1 = 2, V1 = 1
H2 = 1, V2 = 2
H3 = 1, V3 = 1
[Figure: the planes divided into MCU1 to MCU6; each small rectangle represents a data unit or block, i.e. 8x8 pixels]
The blocks of these MCUs are transferred to the next step.
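How the data units of Example 2 fall into MCUs can be sketched in Python. The function `interleaved_order` is a hypothetical helper, and the sketch assumes that every plane divides evenly into MCUs (no padding), which holds for the plane sizes of Example 2.

```python
def interleaved_order(blocks_per_plane, factors):
    """List data units in interleaved (MCU) order.

    blocks_per_plane: (blocks_wide, blocks_high) for each plane
    factors: (H, V) for each plane
    Returns tuples (mcu_index, plane, block_row, block_col).
    """
    mcus_wide = blocks_per_plane[0][0] // factors[0][0]
    mcus_high = blocks_per_plane[0][1] // factors[0][1]
    order = []
    mcu = 0
    # MCUs are built left to right, top to bottom.
    for mcu_row in range(mcus_high):
        for mcu_col in range(mcus_wide):
            for plane, (h, v) in enumerate(factors):
                # Inside an MCU, each plane contributes H x V data units,
                # again taken left to right, top to bottom.
                for dy in range(v):
                    for dx in range(h):
                        order.append((mcu, plane, mcu_row * v + dy, mcu_col * h + dx))
            mcu += 1
    return order


if __name__ == "__main__":
    # Example 2: planes of 48x32, 48x16, 24x32 and 24x16 pixels, in 8x8 blocks.
    blocks = [(6, 4), (6, 2), (3, 4), (3, 2)]
    factors = [(2, 2), (2, 1), (1, 2), (1, 1)]
    for entry in interleaved_order(blocks, factors)[:9]:
        print(entry)
```

With these inputs the sketch yields six MCUs of nine data units each, which agrees with the MCU1 to MCU6 labels of the example.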
Slide 23
3. Downsampling
[Figure 2. Three color formats in the baseline system: (a) 4:4:4, (b) 4:2:2, (c) 4:2:0. Each panel shows the Y, Cb, and Cr planes against the image width W and height H.]
4:2:0 YCbCr Format: Rather than the horizontal-only 2:1 reduction of Cb and Cr used by 4:2:2, 4:2:0 YCbCr implements a 2:1 reduction of Cb and Cr in both the vertical and horizontal directions. It is commonly used for video compression.
Slide 24
4:1:1 YCbCr Format: For every four horizontal Y samples, there is one Cb and one Cr value. Each component is typically 8 bits. Each sample therefore requires 12 bits on average.
4:2:2 YCbCr Format: For every two horizontal Y samples, there is one Cb and one Cr sample. Each sample is typically 8 bits (consumer applications) or 10 bits (pro-video applications) per component. Each sample therefore requires 16 bits on average (or 20 bits for pro-video applications).
4:4:4 YCbCr Format: Each sample has a Y, a Cb, and a Cr value. Each sample is typically 8 bits (consumer applications) or 10 bits (pro-video applications) per component. Each sample therefore requires 24 bits (or 30 bits for pro-video applications).
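The bit counts quoted for each format follow from simple arithmetic: one Y sample per pixel plus the shared fraction of one Cb and one Cr sample. A small Python sketch, in which the dictionary `CHROMA_PER_LUMA` and the function `average_bits_per_pixel` are illustrative names rather than anything defined by the slides:

```python
from fractions import Fraction

# Number of Cb samples (and likewise Cr samples) stored per Y sample.
CHROMA_PER_LUMA = {
    "4:4:4": Fraction(1, 1),  # full chroma resolution
    "4:2:2": Fraction(1, 2),  # halved horizontally
    "4:1:1": Fraction(1, 4),  # quartered horizontally
    "4:2:0": Fraction(1, 4),  # halved horizontally and vertically
}


def average_bits_per_pixel(fmt, bits_per_component=8):
    """Average storage per pixel: one Y sample plus a share of Cb and Cr."""
    return bits_per_component * (1 + 2 * CHROMA_PER_LUMA[fmt])


if __name__ == "__main__":
    for fmt in ("4:4:4", "4:2:2", "4:1:1", "4:2:0"):
        print(fmt, average_bits_per_pixel(fmt), average_bits_per_pixel(fmt, 10))
```

The same arithmetic gives 12 bits per pixel for 8-bit 4:2:0, since it stores one Cb and one Cr sample for every four Y samples.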
Slide 25
4. Discrete Cosine Transform
A technique for converting a signal into elementary frequency components (transforming an image from the spatial to the frequency domain). Transforming an 8x8 pixel block through the FDCT yields 64 coefficients, which can be regarded as two-dimensional frequencies.
Coding
Predictive Coding: information already sent or available is used to predict future values, and the difference is coded.
Transform Coding: transforms the image from its spatial domain representation to a different type of representation using some well-known transform, and then codes the transformed values (coefficients).
Transform coding relies on the premise that pixels in an image exhibit a certain level of correlation with their neighboring pixels. Consequently, these correlations can be exploited to predict the value of a pixel from its respective neighbors.
4. Discrete Cosine Transform
[Figure: the 8x8 DCT basis, with u and v each running from 0 to 7]
FDCT: The function inside the formula is called the basis function. The 64 basis functions are illustrated by the 8x8 DCT basis image.
Steps of FDCT: At the beginning of the FDCT, the pixel values are shifted into the range [-128, 127], with zero as the center.
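The level shift and the transform itself can be sketched in plain Python. The FDCT formula is not reproduced on the slide, so this sketch assumes the usual JPEG definition, F(u,v) = 1/4 C(u) C(v) sum over x, y of f(x,y) cos((2x+1)u pi/16) cos((2y+1)v pi/16), with C(0) = 1/sqrt(2) and C(k) = 1 otherwise. The function name `fdct_8x8` is illustrative, and the direct quadruple loop is chosen for clarity, not speed.

```python
import math


def fdct_8x8(block):
    """Forward DCT of one 8x8 block of 8-bit samples.

    block[y][x] is the sample in row y, column x.
    Returns coeffs[v][u] as floats (v: vertical, u: horizontal frequency).
    """
    # Step 1: shift the samples from [0, 255] into [-128, 127].
    shifted = [[sample - 128 for sample in row] for row in block]

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    # Step 2: project the shifted block onto each of the 64 basis functions.
    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (shifted[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs


if __name__ == "__main__":
    flat = [[255] * 8 for _ in range(8)]
    # A flat block has only a DC coefficient: 8 * (255 - 128) = 1016.
    print(round(fdct_8x8(flat)[0][0]))
```

A block that varies only from left to right produces non-zero coefficients only in the first row (v = 0), which is one way to read the basis image.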
Slide 28
4. Discrete Cosine Transform
Example: [Figure: an 8x8 block of luminance values, taken from the luminance plane Y of a W x H image, is transformed by the DCT into 8x8 coefficients]
Slide 29
Image Processing
Steps of FDCT, Example 3: Taking the DCT of the shifted block and rounding to the nearest integer gives the coefficient matrix.
[Figure: Example 3 matrices]
Slide 30
Lossy Sequential DCT-based Mode
Step: Quantization
The human eye is fairly good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation. This allows the amount of information in the high-frequency components to be reduced. This is done by simply dividing each coefficient by a constant for that component and then rounding to the nearest integer.
Slide 31
Quantization
This is the main lossy operation in the whole process. As a result, many of the higher-frequency components are typically rounded to zero. A common quantization matrix (i.e. the numbers by which the coefficients are divided) is
[Figure: quantization matrix]
Slide 32
5. Quantization
Example:
F(u,v): 8x8 DCT coefficients
Q(u,v): quantization matrix

 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99
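Dividing F(u,v) by Q(u,v) and rounding can be sketched in Python as below. The table `Q_LUMA` is the example luminance table of the JPEG standard, which is assumed here to be the matrix shown on this slide. The names `quantize` and `dequantize` are illustrative, and rounding exact halves away from zero is an assumption, since the slides only say "rounding to the nearest integer".

```python
import math

# Example luminance quantization table from the JPEG standard
# (assumed to be the Q(u,v) shown on the slide).
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_half_away(x):
    """Round to the nearest integer; exact halves go away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(coeffs, table=Q_LUMA):
    """Divide each DCT coefficient by its table entry and round."""
    return [[round_half_away(coeffs[r][c] / table[r][c]) for c in range(8)]
            for r in range(8)]


def dequantize(levels, table=Q_LUMA):
    """Decoder side: multiply back. The rounding error is not recovered."""
    return [[levels[r][c] * table[r][c] for c in range(8)] for r in range(8)]


if __name__ == "__main__":
    coeffs = [[0.0] * 8 for _ in range(8)]
    coeffs[0][0], coeffs[0][1], coeffs[7][7] = -415.0, -30.0, 40.0
    levels = quantize(coeffs)
    print(levels[0][0], levels[0][1], levels[7][7])
```

The large divisors in the lower right corner are what push most high-frequency coefficients to zero.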
Slide 33
Quantization
Using this quantization matrix with the DCT coefficient matrix of Example 3 results in
[Figure: quantized coefficient matrix]
Here we see that most of the high-frequency components are zero. This matrix is sent to the next step for entropy encoding.
Slide 34
Lossy Sequential DCT-based Mode Entropy Encoding
Slide 35
Lossy Sequential DCT-based Mode
Step: Entropy Encoding
Encoding of DC-coefficient: During the initial step of entropy encoding, a DC-coefficient is encoded as the difference between the current one and the previous one. Only the differences are subsequently processed.
[Figure: Block i-1 with DC(i-1) and Block i with DC(i)]
Diff = DC(i) - DC(i-1)
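The differencing of DC coefficients can be sketched in Python. The function names are illustrative, and starting the prediction at 0 for the first block is an assumption that the slide does not state.

```python
def dc_differences(dc_values, initial_prediction=0):
    """Replace each DC coefficient by its difference from the previous one."""
    diffs = []
    previous = initial_prediction  # assumed starting prediction
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_from_differences(diffs, initial_prediction=0):
    """Decoder side: rebuild the DC coefficients by running summation."""
    values = []
    previous = initial_prediction
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


if __name__ == "__main__":
    dcs = [50, 52, 51, 51, 53]
    print(dc_differences(dcs))
```

Apart from the first value, the differences stay small when neighboring blocks have similar average brightness, which is what makes them cheaper to code.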
Slide 36
Entropy Encoding
Encoding of DC-coefficient: The DC coefficients are usually highly correlated, so taking differences reduces the entropy of the DC data stream. The result is a series of numbers.
Example 4: This may result in something like 1, 2, -1, 0, 2, 3, -3, ..., one number for each block in the image, which is to be compressed with a lossless entropy encoding method. These numbers are encoded according to the following table. Rows are numbered on the left, columns along the top.

Row |  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
  0 |                                   0
  1 |                              -1   1
  2 |                          -3  -2   2   3
  3 |                  -7  -6  -5  -4   4   5   6   7
  4 | -15 -14 -13 -12 -11 -10  -9  -8   8   9  10  11  12  13  14  15
  5 | ...
Slide 37
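Entropy Encoding
Encoding of DC-coefficient: Each number is located in the table by its (row, column) position. It is written as the 4-bit row number followed by the low-order bits of the column number; the row number tells how many column bits are kept.
- 1 is at (1, 0), in binary (0001, 00000000000), represented as (0001, 0)
- 3 is at (2, 1), in binary (0010, 00000000001), represented as (0010, 01)
- -3 is at (2, -2), in binary (0010, 11111111110), represented as (0010, 10)
Under the baseline mode the first number (i.e. 0001 of (0001, 0)) is additionally compressed using either a default Huffman table or, optionally, one provided with the image.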
Slide 38
Encoding of AC-coefficient:
Zig-Zag ordering of AC-coefficients: The AC-coefficients are taken in zig-zag order.
[Figure: zig-zag scan order]
The zig-zag sequence collects the low-frequency coefficients before the high-frequency coefficients, thereby grouping the large numbers at the beginning of the sequence. The zig-zag sequence for Example 3 after quantization will be
3, 0, 3, 2,
6, 2, 4, 1, 4, 1, 1, 5, 1, 2, 1, 1, 1, 2, 0, 0, 0, 0, 0, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Slide 39
6. Zig-Zag Reordering
Example:

 44   4   6   5   1  -1   0   0
-11  -6  -5  -4   2   0   0   0
  6   2   4   2   0   0   0   0
 -3   1  -1  -2   1   0   0   0
 -9  -2   0   0   0   0   0   0
  4   0   1  -1  -1   0   0   0
 -1  -1   0   0   1   0   0   0
  0  -1   0   0   0   0   0   0

Zig-Zag Reordering:
44, 4, -11, 6, -6, 6, 5, -5, 2, -3, -9, 1, 4, -4, 1, -1, 2, 2, -1, -2, 4, -1, 0, 0, -2, 0, 0, 0, 0, 0, 0, 1, 0, 1, -1, 0, -1, 0, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Issue: Quantization generates error by its nature. Fault-tolerant systems therefore need to distinguish between quantization errors and computer failure errors. This is one of the challenges in the research project.
To improve the compression ratio, the quantized block is rearranged into zig-zag order, and the run-length coding method is then applied to convert the sequence into intermediate symbols.
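The zig-zag reordering of this slide's example can be sketched in Python. The names `zigzag_indices`, `zigzag`, and `BLOCK` are illustrative; the entries of `BLOCK` are the example block as implied by the zig-zag sequence listed on the slide.

```python
def zigzag_indices(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        cells = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            cells.reverse()
        order.extend(cells)
    return order


def zigzag(block):
    """Flatten a square block into its zig-zag sequence."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


# Example block (assumed: rebuilt from the slide's zig-zag sequence).
BLOCK = [
    [44, 4, 6, 5, 1, -1, 0, 0],
    [-11, -6, -5, -4, 2, 0, 0, 0],
    [6, 2, 4, 2, 0, 0, 0, 0],
    [-3, 1, -1, -2, 1, 0, 0, 0],
    [-9, -2, 0, 0, 0, 0, 0, 0],
    [4, 0, 1, -1, -1, 0, 0, 0],
    [-1, -1, 0, 0, 1, 0, 0, 0],
    [0, -1, 0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(zigzag(BLOCK))
```

The first entry of the result is the DC coefficient; the remaining 63 entries are the AC coefficients handed to run-length coding.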
Slide 40
9. Huffman Coding

Category  Values                          Bits for the value
1         -1, 1                           0, 1
2         -3, -2, 2, 3                    00, 01, 10, 11
3         -7, -6, -5, -4, 4, 5, 6, 7      000, 001, 010, 011, 100, 101, 110, 111
4         -15,...,-8, 8,...,15            0000,...,0111, 1000,...,1111
5         -31,...,-16, 16,...,31          00000,...,01111, 10000,...,11111
6         -63,...,-32, 32,...,63          000000,...,011111, 100000,...,111111
7         -127,...,-64, 64,...,127        0000000,...,0111111, 1000000,...,1111111
8         -255,...,-128, 128,...,255      ...
9         -511,...,-256, 256,...,511      ...
10        -1023,...,-512, 512,...,1023    ...
11        -2047,...,-1024, 1024,...,2047  ...

Figure 6. Table of values and bits for the value
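The mapping in Figure 6 can be computed rather than looked up. A minimal Python sketch, with the illustrative names `category` and `value_bits`: the category is the number of bits needed for the magnitude, a positive value is written in plain binary, and a negative value is written as value + 2^category - 1, which reproduces the bit patterns in the table.

```python
def category(value):
    """Number of bits needed for |value|; 0 for the value 0."""
    return abs(value).bit_length()


def value_bits(value):
    """Bit string that follows the category, as listed in Figure 6."""
    cat = category(value)
    if cat == 0:
        return ""  # category 0 carries no extra bits
    if value < 0:
        # Negative values map onto the lower half of the category's range.
        value += (1 << cat) - 1
    return format(value, f"0{cat}b")


if __name__ == "__main__":
    for v in (57, 45, 23, -30, -8, 1, -511):
        print(v, category(v), value_bits(v))
```

For a negative value the bits are the one's complement of the magnitude, which is why -1 maps to 0 and -3 maps to 00.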
Slide 41
9. Huffman Coding
Example
Run-length coding of the 63 AC coefficients:
(0,57) ; (0,45) ; (4,23) ; (1,-30) ; (0,-8) ; (2,1) ; (0,0)
Encode the right-hand value of each pair as a category and the bits for the value, except for the special markers such as (0,0) or (15,0):
(0,6,111001) ; (0,6,101101) ; (4,5,10111) ; (1,5,00001) ; (0,4,0111) ; (2,1,1) ; (0,0)
The difference of the DC coefficient: -511
Encode the value as a category and the bits for the value: 9, 000000000
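The run-length step that produces pairs such as (4,23) can be sketched in Python. The function name `run_length_pairs` is illustrative. The sketch assumes the usual meaning of the two special markers: (15,0) stands for a run of sixteen zeros, and (0,0) marks the end of the block when only zeros remain.

```python
def run_length_pairs(ac):
    """Convert 63 zig-zag ordered AC coefficients into (run, value) pairs.

    run is the number of zeros that precede a non-zero value.
    """
    pairs = []
    run = 0
    for value in ac:
        if value == 0:
            run += 1
            continue
        while run > 15:
            pairs.append((15, 0))  # sixteen zeros in a row
            run -= 16
        pairs.append((run, value))
        run = 0
    if run > 0:
        pairs.append((0, 0))  # end of block: the rest is all zeros
    return pairs


if __name__ == "__main__":
    # A coefficient sequence consistent with the pairs on the slide.
    ac = [57, 45, 0, 0, 0, 0, 23, 0, -30, -8, 0, 0, 1] + [0] * 50
    print(run_length_pairs(ac))
```

The input in the usage lines is a hypothetical sequence chosen so that its pairs match the slide's example; the slide does not list the coefficients themselves.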
Slide 42
9. Huffman Coding

run/category  code length  code word
0/0           4            1010
...
0/6           7            1111000
...
0/10          16           1111111110000011
1/1           4            1100
...
1/5           11           11111110110
...
4/5           16           1111111110011000
...
15/10         16           1111111111111110

Figure 7. Huffman table of luminance AC coefficients
Slide 43
9. Huffman Coding

category  code length  code word
0         2            00
1         3            010
2         3            011
3         3            100
4         3            101
5         3            110
6         4            1110
7         5            11110
8         6            111110
9         7            1111110
10        8            11111110
11        9            111111110

Figure 8. Huffman table of luminance DC coefficients
Slide 44
9. Huffman Coding
Example
The AC coefficients:
(0,6,111001) ; (0,6,101101) ; (4,5,10111) ; (1,5,00001) ; (0,4,0111) ; (2,1,1) ; (0,0)
Encode the left two values in each () using Huffman encoding:
1111000 111001, 1111000 101101, 1111111110011000 10111, 11111110110 00001, 1011 0111, 11100 1, 1010
The DC coefficient: 9, 000000000
Encode the category using Huffman encoding:
1111110 000000000
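The final lookup step can be sketched in Python with a partial code table. `DC_CODES` holds the luminance DC codes of Figure 8. `AC_CODES` holds only the entries listed in Figure 7 plus the codes for 0/4 and 2/1, which are taken from the worked output on this slide. The function names are illustrative, and the space between code word and value bits is kept only for readability, as on the slide.

```python
# Luminance DC codes (Figure 8): category -> code word.
DC_CODES = {
    0: "00", 1: "010", 2: "011", 3: "100", 4: "101", 5: "110",
    6: "1110", 7: "11110", 8: "111110", 9: "1111110",
    10: "11111110", 11: "111111110",
}

# Partial luminance AC codes: (run, category) -> code word.
AC_CODES = {
    (0, 0): "1010",  # end of block
    (0, 4): "1011",
    (0, 6): "1111000",
    (0, 10): "1111111110000011",
    (1, 1): "1100",
    (1, 5): "11111110110",
    (2, 1): "11100",
    (4, 5): "1111111110011000",
    (15, 10): "1111111111111110",
}


def encode_dc(cat, bits):
    """Huffman code of the category followed by the value bits."""
    return DC_CODES[cat] + " " + bits


def encode_ac(symbols):
    """Encode (run, category, bits) triples; 2-tuples are special markers."""
    encoded = []
    for symbol in symbols:
        if len(symbol) == 2:
            encoded.append(AC_CODES[symbol])
        else:
            run, cat, bits = symbol
            encoded.append(AC_CODES[(run, cat)] + " " + bits)
    return encoded


if __name__ == "__main__":
    symbols = [(0, 6, "111001"), (0, 6, "101101"), (4, 5, "10111"),
               (1, 5, "00001"), (0, 4, "0111"), (2, 1, "1"), (0, 0)]
    print(", ".join(encode_ac(symbols)))
    print(encode_dc(9, "000000000"))
```

A complete encoder would carry the full 162-entry AC table; a symbol missing from this partial table raises a KeyError.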
Slide 45
End