STATISTIC & INFORMATION THEORY (CSNB134) MODULE 8: INTRODUCTION TO INFORMATION THEORY

Page 1:

STATISTIC & INFORMATION THEORY

(CSNB134)

MODULE 8: INTRODUCTION TO INFORMATION THEORY

Page 2:

Recaps

In the Course Overview, it was highlighted that this course is divided into two main parts:
(1) understanding fundamental statistics
(2) understanding basic information theory

The first seven modules (i.e. Module 1, Module 2, ..., Module 7) covered the first part, understanding fundamental statistics. The remaining modules are about understanding basic information theory.

In this module, students are introduced to Information Theory.

Page 3:

Model of Information Theory

The most basic model of Information Theory is:

where information is generated at the source, sent through a channel, and consumed at the drain.

Often, information needs to go through several processes of coding before it can be sent through the channel (i.e. information is coded through several processes before it is transmitted through the channel).

The 'coded' information then needs to be decoded at the receiving end in order to convert it back to its original form, so that the receiver can read / view its content.

Page 4:

Model of Information Theory (cont.)

Thus, the previous basic model of Information Theory

can be expanded further as follows:

Source coding is the coding mechanism that transforms information into its appropriate format of representation (e.g. a text document, a JPEG image, MP3 audio, etc.)

Channel coding is the coding mechanism that transforms the information into the appropriate format for transportation, one which suits the capacity of the channel.

Page 5:

Basic Form of Digital Information

In the digital world, information is represented in binary digits, known as bits.

A bit can either be a '1' or a '0', i.e. base 2.

Since information is often represented by a huge number of bits, we often quote quantities in terms of bytes, where:

1 byte  = 8 bits
1 kbit  = 2^10 bits  = 1024 bits
1 kbyte = 2^10 bytes = 1024 bytes
1 Mbyte = 2^20 bytes = 1,048,576 bytes
1 Gbyte = 2^30 bytes = 1,073,741,824 bytes

Page 6:

Channel Capacity

We mentioned previously, that information is generated at the source, sent through the channel and consumed in the drain.

The analogy of channel capacity is similar to a pipe channelling water to fill up a basin.

The time it takes to fill up the basin depends very much on the diameter of the pipe.

Similarly, a channel with more capacity can transmit more information from the source to the drain within a specified period of time.

Page 7:

Channel Capacity (cont.)

If a channel has a capacity to transmit N bits/s, and M bits have to be sent, it takes M/N seconds for the message to get through.

If the message is continuous ('Streaming Audio' or 'Streaming Video'), the channel capacity must be at least as large as the data rate (bits/s).

How long does it take to send an e-mail of 12.4 kbyte across a channel of 56 kbit/s?
(12.4 * 8 kbit) / (56 kbit/s) ≈ 1.8 s

When a channel can take 8 Mbyte/s, how many audio signals of 128 kbit/s can it carry?
(8 * 1024 * 8 kbit/s) / (128 kbit/s) = 512
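The two worked examples above can be checked with a few lines of arithmetic. This is an illustrative sketch; the variable names are our own:

```python
# Example 1: time to send a 12.4 kbyte e-mail over a 56 kbit/s channel.
email_kbits = 12.4 * 8        # 12.4 kbyte expressed in kbit (1 byte = 8 bits)
channel_kbps = 56             # channel capacity in kbit/s
time_s = email_kbits / channel_kbps
print(round(time_s, 1))       # approximately 1.8 seconds

# Example 2: number of 128 kbit/s audio streams an 8 Mbyte/s channel can carry.
channel2_kbps = 8 * 1024 * 8  # 8 Mbyte/s expressed in kbit/s
streams = channel2_kbps // 128
print(streams)                # 512 streams
```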

Page 8:

Code Representation

Previously, we learned that the basic form of digital information is in bits.

If we have 2 bits, we can derive the following table:

which implies that we can represent 4 symbols with 2 bits.

Symbol  Bit1  Bit0
S0      0     0
S1      0     1
S2      1     0
S3      1     1

Page 9:

Code Representation (cont.)

How many symbols can we represent with 3 bits?

We can represent 2^n symbols with n bits.

Symbol  Bit2  Bit1  Bit0
S0      0     0     0
S1      0     0     1
S2      0     1     0
S3      0     1     1
S4      1     0     0
S5      1     0     1
S6      1     1     0
S7      1     1     1

We can represent 2^3 = 8 symbols with 3 bits.

Page 10:

Code Representation (cont.)

How many bits do we need to transmit 128 different symbols? How many bits needed to represent 100 symbols?

We can represent 2^n symbols with n bits.

Thus, since 2^7 = 128, n = 7 bits are needed to transmit both 128 and 100 symbols.

The answer to the last question shows that a given number of bits can sometimes represent more symbols than required: 7 bits can encode up to 128 symbols, even when only 100 are needed.
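The rule above reduces to computing the smallest n with 2^n >= M for M symbols. A minimal sketch (the function name is our own):

```python
import math

def bits_needed(num_symbols):
    """Smallest n such that 2**n >= num_symbols."""
    return math.ceil(math.log2(num_symbols))

print(bits_needed(128))  # 7 bits, since 2^7 = 128 exactly
print(bits_needed(100))  # also 7 bits, since 2^6 = 64 < 100 <= 128
```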

Page 11:

Code Representation (cont.)

Next we shall learn about several types of binary code representation, which include:
(i) Binary Code
(ii) BCD Code
(iii) Hex Code
(iv) ASCII Code

Page 12:

Binary Code

Binary code is a straightforward transformation of a number into its equivalent binary format.

For example, 4 and 7 are represented as '100' and '111' in their binary code formats.

What is the binary representation of 42819?

Power   2^15 2^14 2^13 2^12 2^11 2^10 2^9 2^8 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Binary  1    0    1    0    0    1    1   1   0   1   0   0   0   0   1   1

42819 = 2^15 + 2^13 + 2^10 + 2^9 + 2^8 + 2^6 + 2^1 + 2^0
      = 32768 + 8192 + 1024 + 512 + 256 + 64 + 2 + 1
      = 42819
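The conversion above can be verified directly. A quick sketch:

```python
n = 42819
# Format the integer as a 16-bit binary string.
print(format(n, '016b'))   # 1010011101000011

# Verify by summing the powers of two whose bits are set.
total = 2**15 + 2**13 + 2**10 + 2**9 + 2**8 + 2**6 + 2**1 + 2**0
print(total)               # 42819
```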

Page 13:

BCD (Binary Coded Decimal) Code

In BCD code, each digit of a number is independently converted into its own binary representation.

For example, the BCD representation of integer 42819 is:

BCD is sub-optimal: it requires 20 bits to represent 42819, compared to 16 bits in binary code!

Integer  4    2    8    1    9

Binary   0100 0010 1000 0001 1001

Note: each digit is independently converted into its binary equivalent.

Note: 4 bits are needed to sufficiently represent the digits 0 to 9.
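The digit-by-digit conversion can be sketched in a few lines (the function name is our own):

```python
def to_bcd(n):
    """Encode each decimal digit of n independently as 4 bits."""
    return ' '.join(format(int(digit), '04b') for digit in str(n))

print(to_bcd(42819))  # 0100 0010 1000 0001 1001  (20 bits for 5 digits)
```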

Page 14:

BCD (Binary Coded Decimal) Code (cont.)

Whichever number is represented in BCD, only 10 of the 16 possible 4-bit symbols are used; the remaining 6 are never used.

This results in 37.5% 'wastage' (i.e. 6/16 * 100%).

Symbol Bit3 Bit2 Bit1 Bit0

Num 0 0 0 0 0

Num 1 0 0 0 1

Num 2 0 0 1 0

Num 3 0 0 1 1

Num 4 0 1 0 0

Num 5 0 1 0 1

Num 6 0 1 1 0

Num 7 0 1 1 1

Num 8 1 0 0 0

Num 9 1 0 0 1

Unused 1 0 1 0

Unused 1 0 1 1

Unused 1 1 0 0

Unused 1 1 0 1

Unused 1 1 1 0

Unused 1 1 1 1

Page 15:

Hexadecimal Code

The main limitation of BCD is that we are converting decimal numbers, which consist of only 10 symbols (i.e. 0 to 9, also known as base-10), into a 4-bit binary equivalent, resulting in the wastage of 6 symbols.

This limitation is overcome by hexadecimal code, which is based on hexadecimal numbers consisting of 16 symbols (i.e. 0 to 15, also known as base-16).

Thus, all 16 symbols of the 4 bits can be fully utilized!

Page 16:

Hexadecimal Code (cont.)

Decimal  Hexadecimal  Bit3 Bit2 Bit1 Bit0
0        0            0    0    0    0
1        1            0    0    0    1
2        2            0    0    1    0
3        3            0    0    1    1
4        4            0    1    0    0
5        5            0    1    0    1
6        6            0    1    1    0
7        7            0    1    1    1
8        8            1    0    0    0
9        9            1    0    0    1
10       A            1    0    1    0
11       B            1    0    1    1
12       C            1    1    0    0
13       D            1    1    0    1
14       E            1    1    1    0
15       F            1    1    1    1

Previously we have learned that there are 8 bits in a byte.

Thus, we can represent 2 hexadecimal digits in a byte (i.e. 1 hexadecimal digit requires 4 bits).

11111111 = FF in hex = (15 * 16^1) + (15 * 16^0) = 255 in decimal
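The byte-to-hex relationship above can be checked directly:

```python
# One byte of all ones equals FF in hex and 255 in decimal.
print(format(0b11111111, 'X'))      # FF
print(int('FF', 16))                # 255
print(15 * 16**1 + 15 * 16**0)      # 255, expanding the two hex digits
```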

Page 17:

ASCII Code

ASCII (American Standard Code for Information Interchange), is a character encoding based on the American alphabet.

Here, 7 bits out of a byte are considered sufficient to represent the 95 printable symbols, in addition to another 33 control characters (e.g. DELETE, <SPACE> etc.)

The eighth bit is used as a parity bit, which provides single-bit error detection capability.
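A parity bit can be computed as sketched below. Note the slide does not say whether even or odd parity is used; this sketch assumes even parity (the parity bit makes the total count of ones even), and the function name is our own:

```python
def with_even_parity(codepoint):
    """Prefix a 7-bit ASCII code with an even-parity bit as the 8th bit."""
    seven = format(codepoint, '07b')      # 7-bit ASCII code
    parity = seven.count('1') % 2         # 1 if the count of ones is odd
    return str(parity) + seven            # 8 bits in total

# 'A' is 1000001 (two ones), so the even-parity bit is 0.
print(with_even_parity(ord('A')))         # 01000001
```

A receiver recomputes the parity of the 8 received bits; if the count of ones is odd, a single-bit error has occurred.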

Page 18:

ASCII Code (cont.)

Page 19:

Comparisons of Code Representation

The decimal number 54276 can be represented as:

Code            Equivalent Representation
Decimal         54276
BCD             0101 0100 0010 0111 0110
Binary          1101 0100 0000 0100
Hex             D404
Hex (Binary)    1101 0100 0000 0100
ASCII (Hex)     35 34 32 37 36
ASCII (Dec)     53 52 50 55 54
ASCII (Binary)  0110101 0110100 0110010 0110111 0110110

Note: ASCII needs more data for transmission, but is valuable because it can represent many more symbols!

Page 20:

STATISTIC & INFORMATION THEORY

(CSNB134)

INTRODUCTION TO INFORMATION THEORY

-- END --