Digital Communications III (ECE 154C)

Introduction to Coding and Information Theory

Tara Javidi

These lecture notes were originally developed by the late Prof. J. K. Wolf.

UC San Diego

Spring 2014

Overview of ECE 154C

Course Overview

• Course Overview I

• Overview II

Examples


Course Overview I: Digital Communications Block Diagram

• Note that the Source Encoder converts all types of information to a stream of binary digits.

• Note that the Channel Encoder, in an attempt to protect the source-coded (binary) stream, judiciously adds redundant bits.

• Sometimes the output of the source decoder must be an exact replica of the information (e.g. computer data) — called NOISELESS CODING (aka lossless compression).

• Other times the output of the source decoder can be approximately equal to the information (e.g. music, TV, speech) — called CODING WITH DISTORTION (aka lossy compression).

Overview II: What will we cover?

REFERENCE: CHAPTER 10, ZIEMER & TRANTER

SOURCE CODING - NOISELESS CODES

◦ Basic idea is to use as few binary digits as possible and still be able to recover the information exactly
◦ Topics include:
  • Huffman Codes
  • Shannon-Fano Codes
  • Tunstall Codes
  • Entropy of a Source
  • Lempel-Ziv Codes

SOURCE CODING WITH DISTORTION

◦ Again the idea is to use the minimum number of binary digits for a given value of distortion
◦ Topics include:
  • Gaussian Source
  • Optimal Quantizing

CHANNEL CAPACITY OF A NOISY CHANNEL

◦ Even if the channel is noisy, messages can be sent essentially error-free if extra digits are transmitted
◦ Basic idea is to use as few extra digits as possible
◦ Topics covered:
  • Channel Capacity
  • Mutual Information
  • Some Examples

CHANNEL CODING

◦ Basic idea — detect errors that occurred on the channel and then correct them
◦ Topics covered:
  • Hamming Code
  • General Theory of Block Codes (Parity Check Matrix, Generator Matrix, Minimum Distance, etc.)
  • LDPC Codes
  • Turbo Codes
  • Code Performance

A Few Examples


Example 1: 4 letter DMS

Basic concepts came from one paper of one man named Claude Shannon! Shannon used simple models that capture the essence of the problem!

EXAMPLE 1 – Simple Model of a source (Called a DISCRETE MEMORYLESS SOURCE or DMS)

• I.I.D. (Independent and Identically Distributed) source letters
• Alphabet size of 4 (A, B, C, D)
• P(A) = p1, P(B) = p2, P(C) = p3, P(D) = p4, with ∑i pi = 1
• Simplest Code: A → 00, B → 01, C → 10, D → 11
• Average length of code words: L = 2(p1 + p2 + p3 + p4) = 2

Q: Can we use fewer than 2 binary digits per source letter (on the average) and still recover the information from the binary sequence?

A: Depends on the values of (p1, p2, p3, p4).

Example 2: Binary Symmetric Channel

EXAMPLE 2 – Simple Model for Noisy Channel

Channels, as you saw in ECE154B, can be viewed as follows. If s0(t) = −s1(t) and the signals are equally likely,

Perror = Q( √(2E/N0) ) = p

Shannon considered a simpler channel called the binary symmetric channel (or BSC for short). Pictorially (figure omitted) and mathematically:

PY|X(y|x) = 1 − p  if y = x,   and   p  if y ≠ x

Q: Can we send information "error-free" over such a channel even though p ≠ 0, 1?

A: Depends on the rate of transmission (how many channel uses are allowed per information bit). Essentially, for a small enough transmission rate (to be defined precisely), the answer is YES!

Example 3: DMS with Alphabet size 8

EXAMPLE 3 – Discrete Memoryless Source with an alphabet of 8 letters: {A, B, C, D, E, F, G, H}

• Probabilities: pA ≥ pB ≥ pC ≥ pD ≥ pE ≥ pF ≥ pG ≥ pH
• See the following codes:

Q: Which codes are uniquely decodable? Which ones are instantaneously decodable? Compute the average length of the codewords for each code.

EXAMPLE 4 – Can you optimally design a code?

L = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/16)(4) + 4·(1/64)(6)
  = 1/32 + 1/32 + 1/16 + 1/8 + 1/4 + 1/2 + 1 = 2

We will see that this is an optimal code (not only among the single-letter constructions but overall).

EXAMPLE 5 –

L = .1 + .1 + .2 + .2 + .3 + .5 + 1 = 2.4

But here we can do better by encoding 2 source letters (or more) at a time!


Source Coding: Lossless Compression

Source Coding: A Simple Example

Back to our simple example of a source:

P[A] = 1/2,  P[B] = 1/4,  P[C] = 1/8,  P[D] = 1/8

Assumptions

1. One must be able to uniquely recover the source sequence from the binary sequence
2. One knows the start of the binary sequence at the receiver
3. One would like to minimize the average number of binary digits per source letter

Code 1:  A → 00, B → 01, C → 10, D → 11
  ABAC → 00010010 → ABAC
  L = 2

Code 2:  A → 0, B → 1, C → 10, D → 11
  AABD → 00110 → CBBA (→ CBD) (→ AABD)
  L = 5/4
  This code is useless. Why?

Code 3:  A → 0, B → 10, C → 110, D → 111
  ABACD → 0100110111 → ABACD
  L = 7/4
  Minimum length code satisfying Assumptions 1 and 2!
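The failure of Code 2 is easy to check by brute force. Below is a minimal sketch (not part of the original notes) that enumerates every way a bit string can be split into codewords: a uniquely decodable stream should admit exactly one parse, Code 2 yields several, and the prefix code (Code 3) always parses uniquely.

```python
def parses(bits, code):
    """Return all decodings of `bits` under `code` (symbol -> codeword)."""
    if bits == "":
        return [[]]
    out = []
    for sym, cw in code.items():
        if bits.startswith(cw):
            out += [[sym] + rest for rest in parses(bits[len(cw):], code)]
    return out

code2 = {"A": "0", "B": "1", "C": "10", "D": "11"}     # non-singular but not U.D.
code3 = {"A": "0", "B": "10", "C": "110", "D": "111"}  # prefix (instantaneous) code

print(parses("00110", code2))       # several parses -> ambiguous
print(parses("0100110111", code3))  # exactly one parse: A B A C D
```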

Source Coding: Basic Definitions

Codeword (aka Block Code)
Each source symbol is represented by some sequence of coded symbols called a code word.

Non-Singular Code
Code words are distinct.

Uniquely Decodable (U.D.) Code
Every distinct concatenation of m code words is distinct, for every finite m.

Instantaneous Code
A U.D. code where we can decode each code word without seeing subsequent code words.

Example

Back to the simple case of the 4-letter DMS:

Source Symbol   Code 1   Code 2   Code 3   Code 4
A               0        00       0        0
B               1        01       10       01
C               00       10       110      011
D               01       11       111      111
Non-Singular    Yes      Yes      Yes      Yes
U.D.            No       Yes      Yes      Yes
Instantaneous   No       Yes      Yes      No

A NECESSARY AND SUFFICIENT CONDITION for a code to be instantaneous is that no code word be a PREFIX of any other code word.

Coding Several Source Symbols at a Time

Example 1: 3 letter DMS

Source Symbol   Probability   U.D. Code
A               .5            0
B               .35           10
C               .15           11

L1 = 1.5 (bits / symbol)

Let us consider two consecutive source symbols at a time:

2 Symbols   Probability   U.D. Code
AA          .25           01
AB          .175          11
AC          .075          0010
BA          .175          000
BB          .1225         101
BC          .0525         1001
CA          .075          0011
CB          .0525         10000
CC          .0225         10001

L2 = 2.9275 (bits / 2 symbols),  so L2/2 = 1.46375 (bits / symbol)

In other words,

1. It is more efficient to build a code for 2 source symbols!
2. Is it possible to decrease the length more and more by increasing the alphabet size?

To answer this question, it is useful to characterize the best code precisely. The codes given above are Huffman Codes. The procedure for making Huffman Codes is described next.

Minimizing average length

Binary Huffman Codes

1. Order probabilities - highest to lowest
2. Add the two lowest probabilities
3. Reorder probabilities
4. Break ties in any way you want
5. Assign 0 to the top branch and 1 to the bottom branch (or vice versa)
6. Continue until we have only one probability, equal to 1
7. L = sum of the probabilities of the combined nodes (i.e., the circled ones)

Example

1. {.1, .2, .15, .3, .25} → order → {.3, .25, .2, .15, .1}
2. Combine the two lowest: {.3, .25, .2, (.15 + .1) = .25}
3. Reorder: get either {.3, (.15, .1) = .25, .25, .2} or {.3, .25, (.15, .1) = .25, .2}

Optimality of Huffman Coding

1. A binary Huffman code will have the shortest average length as compared with any U.D. code for that set of probabilities.
2. The Huffman code is not unique. Breaking ties in different ways can result in very different codes. The average length, however, will be the same for all of these codes.
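As an illustration of the procedure, here is a minimal binary Huffman construction in Python (a sketch, not code from the notes). It applies the merge rule above and, for the example set {.3, .25, .2, .15, .1}, reproduces the average length 2.25 computed on the next slide.

```python
# Repeatedly combine the two least probable nodes; prefix the codewords of the
# two merged subtrees with 0 and 1.
import heapq
from itertools import count

def huffman_code(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> codeword."""
    tie = count()  # tie-breaker so heapq never has to compare dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # two lowest probabilities
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

probs = {"a": .3, "b": .25, "c": .2, "d": .15, "e": .1}
code = huffman_code(probs)
avg = sum(probs[s] * len(w) for s, w in code.items())
print(code, avg)   # average length 2.25, as on the next slide
```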

Huffman Coding: Example

Example Continued

1. {.1, .2, .15, .3, .25} → order → {.3, .25, .2, .15, .1}

(tree figure omitted)

L = .25 + .45 + .55 + 1 = 2.25

Or, breaking the tie the other way (tree figure omitted):

L = .25 + .45 + .55 + 1 = 2.25

Huffman Coding: Tie Breaks

In the last example, the two ways of breaking the tie led to two different codes with the same set of code lengths. This is not always the case — sometimes we get different codes with different code lengths.

EXAMPLE:

Huffman Coding: Optimal Average Length

• A binary Huffman code will have the shortest average length as compared with any U.D. code for that set of probabilities (no U.D. code will have a shorter average length).

◦ The proof that a binary Huffman code is optimal — that is, has the shortest average code word length as compared with any U.D. code for the same set of probabilities — is omitted.

◦ However, we would like to mention that the proof is based on the fact that in the process of constructing a Huffman code for that set of probabilities, other codes are formed for other sets of probabilities, all of which are optimal.

Shannon-Fano Codes

SHANNON-FANO CODES are another binary coding technique to construct U.D. codes (not necessarily optimum!)

1. Order the probabilities in decreasing order.
2. Partition into 2 sets that are as close to equally probable as possible. Label the top set with a "0" and the bottom set with a "1".
3. Continue using step 2 over and over.

Example (probabilities .5, .2, .15, .15):
.5  ⇒ 0
.2  ⇒ 1 0
.15 ⇒ 1 1 0
.15 ⇒ 1 1 1
L = ?

Example (probabilities .4, .3, .1, .05, .05, .05, .05):
.4  ⇒ 0
.3  ⇒ 1 0
.1  ⇒ 1 1 0 0
.05 ⇒ 1 1 0 1
.05 ⇒ 1 1 1 0
.05 ⇒ 1 1 1 1 0
.05 ⇒ 1 1 1 1 1
L = 2.3

Compare with Huffman coding. Same length as Huffman code?!
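For comparison, here is a short sketch of the Shannon-Fano split (again not code from the notes). It assumes the probabilities are already sorted in decreasing order, as step 1 requires, and reproduces the second example's average length of 2.3.

```python
def shannon_fano(probs, prefix=""):
    """probs: list of (symbol, p) sorted by decreasing p. Returns symbol -> codeword."""
    if len(probs) == 1:
        return {probs[0][0]: prefix or "0"}
    total = sum(p for _, p in probs)
    run, split = 0.0, 1
    for i in range(1, len(probs)):
        run += probs[i - 1][1]
        split = i
        # stop at the split whose two halves are as close to equally probable as possible
        if abs(2 * run - total) <= abs(2 * (run + probs[i][1]) - total):
            break
    code = shannon_fano(probs[:split], prefix + "0")
    code.update(shannon_fano(probs[split:], prefix + "1"))
    return code

pmf = [("s1", .4), ("s2", .3), ("s3", .1), ("s4", .05),
       ("s5", .05), ("s6", .05), ("s7", .05)]
code = shannon_fano(pmf)
print(code, sum(p * len(code[s]) for s, p in pmf))   # average length 2.3
```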

More Examples

Binary Huffman Codes

Construct binary Huffman and Shannon-Fano codes where:

EXAMPLE 1: (p1, p2, p3, p4) = (1/2, 1/4, 1/8, 1/8)

EXAMPLE 2: Consider the examples on the previous slide and construct binary Huffman codes.

Large Alphabet Size

Example 3: Consider a binary source {A, B} with (p1, p2) = (.9, .1). Now construct a series of Huffman codes and a series of Shannon-Fano codes, by encoding N source symbols at a time for N = 1, 2, 3, 4.

Shannon-Fano codes are suboptimal!

Example 3: Construct a Shannon-Fano code:

Probability   Length   Codeword
.25           2        00
.20           2        01
.15           3        100
.10           3        101
.05           4        1100
.05           5        11010
.05           5        11011
.05           4        1110
.04           5        11110
.03           6        111110
.03           6        111111

L = 3.11

Compare this with a binary Huffman Code.


Noiseless Source Coding Continued

Non-binary Huffman Coding

• The objective is to create a Huffman code where the code words are from an alphabet with n letters. The procedure is to:

1. Order probabilities high to low (perhaps with an extra symbol with probability 0)
2. Combine the n least likely probabilities. Add them and re-order.
3. End up with n symbols (i.e. probabilities)!!!

Example 1: A source with alphabet {A, B, C, D, E} and probabilities (.5, .3, .1, .08, .02) coded into a ternary stream, n = 3:

Non-binary Huffman Coding

Example 2: n = 3, {A, B, C, D}, (p1, p2, p3, p4) = (.5, .3, .1, .1)

(tree figure omitted)  L1 = 1.5  (SUBOPTIMAL)
(tree figure omitted)  L1 = 1.2  (OPTIMAL)

• Sometimes one has to add Phantom Source Symbols with 0 probability in order to make a Non-Binary Huffman Code.
• How many?
  ◦ If one starts with M source symbols and one combines the n least likely into one symbol, one is left with M − (n − 1) symbols.
  ◦ After doing this α times, one is left with M − α(n − 1) symbols.
  ◦ But at the end we must be left with n symbols. If this is not the case, we must add Phantom Symbols.
• Add D Phantom Symbols to insure that M + D − α(n − 1) = n, or (M + D) = α′(n − 1) + 1.

Non-binary Huffman Coding

EXAMPLES:

n = 3        n = 4        n = 5        n = 6
M    D       M    D       M    D       M    D
3    0       4    0       5    0       6    0
4    1       5    2       6    3       7    4
5    0       6    1       7    2       8    3
6    1       7    0       8    1       9    2
7    0       8    2       9    0       10   1
8    1       9    1       10   3       11   0
9    0       10   0       11   2       12   4
10   1       11   2       12   1       13   3

NOTE:

• M + D − 1 must be divisible by n − 1.
  EX: n = 3 ⇒ M + D − 1 must be even
• D ≤ n − 2
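The divisibility rule above pins D down directly. A one-line helper (a hypothetical name, not from the notes) that reproduces the tables:

```python
# Number D of zero-probability phantom symbols needed so that an n-ary Huffman
# construction on M + D symbols ends with exactly n nodes: we need
# (M + D - 1) to be a multiple of (n - 1), i.e. M + D = 1 (mod n - 1).
def phantom_symbols(M, n):
    return (1 - M) % (n - 1)

# Reproduces the tables above, e.g. n = 4: D = 0, 2, 1, 0, 2, 1, 0, 2 for M = 4..11
print([phantom_symbols(M, 4) for M in range(4, 12)])
```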

Beyond Huffman Codes

Run Length Codes for Fax (B/W)

Variable-length to Fixed-length Source Coding

Previously we only considered the situation where we encoded N source symbols into variable length code sequences for a fixed value of N. We could call this "fixed length to variable length" encoding. But another possibility exists: we could encode variable length source sequences into fixed or variable length code words.

Example 1: Consider the DMS source {A, B} with probabilities (.9, .1) and the following code book:

Source Sequence   Codeword
B                 00
AB                01
AAB               10
AAA               11

Average length of source phrase = 1 × .1 + 2 × .09 + 3 × (.081 + .729) = 2.71
Average # of code symbols / source symbol = 2/2.71 = 0.738

Tunstall Codes

Tunstall codes are U.D. variable- to fixed-length codes with binary code words.

Basic idea – encode into binary code words of fixed length L; make 2^L source phrases that are as nearly equally probable as we can.

We do this by making the source phrases the leaves of a tree and always splitting the leaf with the highest probability.
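A small sketch of this leaf-splitting rule (not code from the notes; tunstall_phrases is a hypothetical helper). With the (.9, .1) source and L = 2 it reproduces the code book of Example 1 above.

```python
# Start with the single-symbol phrases and repeatedly split the most probable
# leaf until splitting again would exceed 2**L leaves.
def tunstall_phrases(p, L):
    """p: dict symbol -> probability. Returns list of (phrase, probability)."""
    leaves = [(s, q) for s, q in p.items()]
    while len(leaves) + len(p) - 1 <= 2 ** L:
        phrase, q = max(leaves, key=lambda x: x[1])   # most probable leaf
        leaves.remove((phrase, q))
        leaves += [(phrase + s, q * ps) for s, ps in p.items()]
    return leaves

phrases = tunstall_phrases({"A": .9, "B": .1}, L=2)
print(sorted(phrases, key=lambda x: -x[1]))
# -> AAA (.729), B (.1), AB (.09), AAB (.081): the code book of Example 1
```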

Tunstall Codes

Example 2: (A, B, C, D) with (p1, p2, p3, p4) = (.5, .3, .1, .1)

(tree figure omitted)

Source phrase   Code word      Source phrase   Code word
D               0000           C               0001
BB              0010           AB              1001
BC              0011           AC              1010
BD              0100           AD              1011
BAA             0101           AAA             1100
BAB             0110           AAB             1101
BAC             0111           AAC             1110
BAD             1000           AAD             1111

Average length of source phrase = (sum of probabilities of internal nodes) = 1 + .5 + .3 + .25 + .15 = 2.2
Average number of code symbols / source symbol = 4/2.2 = 1.82

Improved Tunstall Coding

• Since the phrases are not equally probable, one can use a Huffman code on the phrases.
• The result is encoding a variable number of source symbols into a variable number of code symbols.

Example 3: Back to Example 1 with (A, B) with (.9, .1).

We have seen that with Tunstall alone, the average phrase length = 2.71 and the average # of code symbols per source symbol = 2/2.71 = .738.

Source Phrase   Tunstall Code   Probability   Improved Tunstall (Huffman)
AAA             11              0.729         0
B               00              0.1           11
AB              01              0.09          100
AAB             10              0.081         101

Average # of code symbols per source symbol = (1 + .271 + .171) / (1 + .9 + .81) = 1.442/2.71 = .532

Summary of Results for (A, B) = (.9, .1)

All of the following use 4 code words in the coding table:

1. Huffman Code, N = 2:
   AA → 0, AB → 11, BA → 100, BB → 101

2. Shannon-Fano Code, N = 2:
   AA → 0, AB → 10, BA → 110, BB → 111

3. Tunstall Code:
   B → 00, AB → 01, AAB → 10, AAA → 11

4. Tunstall/Huffman:
   B → 11, AB → 100, AAA → 0, AAB → 101

Lempel-Ziv Source Coding

Lempel-Ziv Source Coding

• The basic idea is that if we have a dictionary of 2^A source phrases (available at both the encoder and the decoder), then in order to encode one of these phrases one needs only A binary digits.
• Normally, a computer stores each symbol as an ASCII character of 8 binary digits. (Actually only 7 are needed.)
• Using L-Z encoding, far fewer than 7 binary digits per symbol are needed. Typically the compression is about 2:1 or 3:1.
• Variants of L-Z codes were the algorithm of the widely used Unix file compression utility compress as well as gzip. Several other popular compression utilities also used L-Z, or closely related encoding.
• LZ became very widely used when it became part of the GIF image format in 1987. It may also (optionally) be used in TIFF and PDF files.
• There are two versions of L-Z codes. We will only discuss the "window" version.

Lempel-Ziv Source Coding

• In (the window version of) Lempel-Ziv, symbols that have already been encoded are stored in a window.
• The encoder then looks at the next symbols to be encoded, to find the longest string in the window that matches the source symbols to be encoded.
  ◦ If it can't find the next symbol in the window, it sends a "0" followed by the 8 (or 7) bits of the ASCII character.
  ◦ If it finds a sequence of one or more symbols in the window, it sends a "1" followed by the bit position of the first symbol in the match, followed by the length of the match. These latter two quantities are encoded into binary.
  ◦ Then the sequence that was just encoded is put into the window.

Lempel-Ziv Source Coding

Example: Suppose the content of the window is given as

15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
 T  H  E     T  H  R  E  E     A  R  E     I  N

The next word, assuming it is "THE ", will be encoded as

(1, "15", "4")   where "15" takes 4 bits and "4" takes some number of bits (to be discussed)

And then the window's content will be updated as

15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
 T  H  R  E  E     A  R  E     I  N     T  H  E

Lempel-Ziv Source Coding

EXAMPLE: Encode the text "MY MY WHAT A HAT IS THAT" using a 16 BIT WINDOW.

[Successive window contents not reproduced here.]

Output token   # of bits
(0, M)         9
(0, Y)         9
(0, ␣)         9
(1, 2, 3)      8
(1, 2, 3)      8
(0, W)         9
(0, H)         9
(0, A)         9
(0, T)         9
(1, 4, 1)      6
(1, 2, 1)      6
(1, 1, 1)      6
(1, 5, 4)      9
(0, I)         9
(0, S)         9
(1, 2, 1)      6
(1, 4, 1)      6
(1, 7, 3)      8
Total:         144

Lempel-Ziv Source Coding

No match: (0, M) → 1 bit more than needed for a symbol (1 + 8 = 9)

Match: (1, position, length) → the leading "1" costs 1 bit; the number of bits for the position depends on the window size; the number of bits for the length depends on the code used to encode lengths.

One really simple code for this purpose might be

1 → 0
2 → 10
3 → 110
...

What are the advantages/disadvantages of this code?
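To make the scheme concrete, here is a toy window encoder (a sketch under several assumptions: a 16-symbol window, tokens left as Python tuples rather than packed into bits, position 0 at the right-hand end of the window as in the "THE THREE ARE IN" example, and the most recent match preferred). Running it on the text above gives a token stream that can be compared against the hand-worked table.

```python
def lz_window_encode(text, window_size=16):
    out, window, i = [], "", 0
    while i < len(text):
        pos, length = -1, 0
        for k in range(min(len(text) - i, window_size), 0, -1):
            pos = window.rfind(text[i:i + k])   # prefer the most recent match
            if pos != -1:
                length = k
                break
        if length == 0:
            out.append((0, text[i]))                        # literal symbol
            i += 1
        else:
            out.append((1, len(window) - 1 - pos, length))  # (flag, position, length)
            i += length
        # slide the just-encoded symbols into the window
        window = (window + text[i - (length or 1):i])[-window_size:]
    return out

print(lz_window_encode("MY MY WHAT A HAT IS THAT"))
```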


Noiseless Source Coding: Fundamental Limits

Shannon Entropy

• Let an I.I.D. source S have M source letters that occur with probabilities p1, p2, ..., pM, with ∑_{i=1}^{M} pi = 1.

• The entropy of the source S is denoted H(S) and is defined as

Ha(S) = ∑_{i=1}^{M} pi loga(1/pi) = − ∑_{i=1}^{M} pi loga pi,   i.e.  Ha(S) = E[ loga(1/pi) ]

• The base of the logarithms is usually taken to be equal to 2. In that case, H2(S) is simply written as H(S) and is measured in units of "bits".
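The definition translates directly into code; a small sketch (base 2 by default):

```python
from math import log

def entropy(probs, base=2):
    """H_a(S) = sum_i p_i * log_a(1 / p_i) for an i.i.d. source."""
    return sum(p * log(1 / p, base) for p in probs if p > 0)

print(entropy([.9, .1]))               # ~0.469 bits, used in the computation example below
print(entropy([.5, .25, .125, .125]))  # 1.75 bits, the 4-letter DMS from earlier
```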

Shannon Entropy

• Other bases can be used and easily transformed to. Note:

loga x = (logb x)(loga b) = (logb x) / (logb a)

Hence,

Ha(S) = Hb(S) · loga b = Hb(S) / (logb a)

A Useful Theorem

Let p1, p2, ..., pM be one set of probabilities and let p′1, p′2, ..., p′M be another (note ∑_{i=1}^{M} pi = 1 and ∑_{i=1}^{M} p′i = 1). Then

∑_{i=1}^{M} pi log(1/pi) ≤ ∑_{i=1}^{M} pi log(1/p′i),

with equality iff pi = p′i for i = 1, 2, ..., M.

Shannon Entropy

Proof.

First note that ln x ≤ x − 1, with equality iff x = 1. (figure omitted)

∑_{i=1}^{M} pi log(p′i/pi) = ( ∑_{i=1}^{M} pi ln(p′i/pi) ) · log e
  ≤ (log e) ∑_{i=1}^{M} pi ( p′i/pi − 1 )
  = (log e) ( ∑_{i=1}^{M} p′i − ∑_{i=1}^{M} pi ) = 0

In other words,

∑_{i=1}^{M} pi log(1/pi) ≤ ∑_{i=1}^{M} pi log(1/p′i)

Shannon Entropy

1. For an equally likely i.i.d. source with M source letters, H2(S) = log2 M (and Ha(S) = loga M for any base a).

2. For any i.i.d. source with M source letters, 0 ≤ H2(S) ≤ log2 M (and similarly in any base a). This follows from the previous theorem with p′i = 1/M for all i.

3. Consider an i.i.d. source with source alphabet S. If we consider encoding m source letters at a time, this is an i.i.d. source whose source letters come from the alphabet S^m. Call this the m-tuple extension of the source and denote it by S^m. Then H2(S^m) = m·H2(S) (and similarly, Ha(S^m) = m·Ha(S) for all a).

The proofs are omitted but are easy.

Computation of Entropy (base 2)

Example 1: Consider a source with M = 2 and (p1, p2) = (0.9, 0.1).

H2(S) = .9 log2(1/.9) + .1 log2(1/.1) = 0.469 bits

From before we gave Huffman codes for this source and extensions of this source for which

L1 = 1
L2/2 = 0.645
L3/3 = 0.533
L4/4 = 0.49

In other words, we note that Lm/m ≥ H2(S). Furthermore, as m gets larger, Lm/m is getting closer to H2(S).
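This convergence is easy to check numerically. The sketch below (not from the notes) builds the m-tuple extension of the (.9, .1) source, Huffman-codes it using the merged-node rule from earlier, and prints Lm/m for m = 1, ..., 4; the values match the list above and approach H2(S) ≈ 0.469.

```python
import heapq
from math import log2, prod
from itertools import product

def huffman_avg_length(probs):
    """Average codeword length = sum of the merged-node probabilities."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

p = [.9, .1]
H = sum(q * log2(1 / q) for q in p)                  # ~0.469 bits
for m in (1, 2, 3, 4):
    ext = [prod(c) for c in product(p, repeat=m)]    # pmf of the m-tuple extension
    print(m, huffman_avg_length(ext) / m)            # 1, 0.645, 0.533, 0.49 -> H2(S)
```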

Performance of Huffman Codes

→ One can prove that, in general, for a binary Huffman code,

H2(S) ≤ Lm/m < H2(S) + 1/m

Example 2: Consider a source with M = 3 and (p1, p2, p3) = (.5, .35, .15).

H2(S) = .5 log2(1/.5) + .35 log2(1/.35) + .15 log2(1/.15) = 1.44 bits

Again we have already given codes for this source such that

1 symbol at a time:  L1 = 1.5
2 symbols at a time: L2/2 = 1.46

Performance of Huffman Codes

Example 3: Consider M = 4 and (p1, p2, p3, p4) = (1/2, 1/4, 1/8, 1/8).

H2(S) = (1/2) log2(1/(1/2)) + (1/4) log2(1/(1/4)) + (1/8) log2(1/(1/8)) + (1/8) log2(1/(1/8)) = 1.75 bits

But from before we gave the code

Source Symbol   Codeword
A               0
B               10
C               110
D               111

for which L1 = H2(S).

This means that one cannot improve on the efficiency of this Huffman code by encoding several source symbols at a time.

Fundamental Source Coding Limit

Theorem [Lossless Source Coding Theorem]
For any U.D. binary code corresponding to the N-th extension of the I.I.D. source S, for every N = 1, 2, ...

LN/N ≥ H(S)

• For a binary Huffman code corresponding to the N-th extension of the I.I.D. source S,

H2(S) + 1/N > LN/N ≥ H2(S)

• But this implies that Huffman coding is asymptotically optimal, i.e.

lim_{N→∞} LN/N = H2(S)

NON-BINARY CODE WORDS

The code symbols that make up the codewords can be from a higher order alphabet than 2.

Example 4: I.I.D. source {A, B, C, D, E} with U.D. codes (where each concatenation of code words can be decoded in only one way):

SYMBOL   TERNARY   QUATERNARY   5-ary
A        0         0            0
B        1         1            1
C        20        2            2
D        21        30           3
E        22        31           4

• A lower bound to the average code length of any U.D. n-ary code (with n letters) is Hn(S).
• For example, for a ternary code, the average length (per source letter), LM/M, is no less than H3(S).


1 / 14

Digital Communications III (ECE 154C)

Introduction to Coding and Information Theory

Tara Javidi

These lecture notes were originally developed by late Prof. J. K. Wolf.

UC San Diego

Spring 2014

Noiseless Source Coding
Fundamental Limits

2 / 14

Fundamental Source Coding Limit

3 / 14

Theorem [Lossless Source Coding Theorem]
For any U.D. n-ary code corresponding to the Nth extension of the I.I.D. Source S, for every N = 1, 2, ...

LN/N ≥ Hn(S)

• Furthermore, for an n-ary Huffman Code corresponding to the Nth extension of the I.I.D. Source S,

Hn(S) + 1/N > LN/N ≥ Hn(S)

• But this implies that Huffman coding is asymptotically optimal, i.e.

lim_{N→∞} LN/N = Hn(S)

Sketch of the Proof

4 / 14

• If one can construct a U.D. code such that li = ⌈log_n(1/pi)⌉, we will have

Hn(S) + 1 > L ≥ Hn(S).¹

• But is it always possible to construct such a code?
• If it is, is it good enough? Can we do better?
• To do this, we first prove useful conditions on a set of integers if they are the lengths of U.D. codewords. These conditions are called the Kraft and McMillan inequalities.

¹ Note that ⌈x⌉ is the unique integer in the interval [x, x + 1).
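A small check of this step (a sketch, using Example 2's source as an assumed input): the lengths li = ⌈log_n(1/pi)⌉ always satisfy the Kraft/McMillan inequality below, and the resulting L sits between Hn(S) and Hn(S) + 1.

import math

def shannon_lengths(p, n=2):
    # l_i = ceil(log_n(1/p_i))
    return [math.ceil(math.log(1 / q, n)) for q in p]

p = [0.5, 0.35, 0.15]
l = shannon_lengths(p)
kraft = sum(2 ** -li for li in l)
L = sum(q * li for q, li in zip(p, l))
H = sum(q * math.log2(1 / q) for q in p)
print(l, kraft, round(H, 2), round(L, 2))   # [1, 2, 3], 0.875 <= 1, H = 1.44, L = 1.65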

Proof of Source Coding Theorem

5 / 14

Codeword Lengths: Necessary and Sufficient Condition

6 / 14

Theorem [Kraft Inequality] A necessary and sufficient condition for the construction of an instantaneous n-ary code with M code words of lengths l1, l2, ..., lM, where the code symbols take on n different values, is that

∑_{i=1}^{M} n^(−li) ≤ 1

PROOF OF SUFFICIENCY:

• We construct an instantaneous code with these code word lengths. Let there be mj code words of length j for j = 1, 2, ..., l* = max li. Then

∑_{i=1}^{M} n^(−li) = ∑_{j=1}^{l*} mj n^(−j)

Kraft Inequality: Proof of sufficiency

7 / 14

• In other words, if ∑_{i=1}^{M} n^(−li) ≤ 1, then ∑_{j=1}^{l*} mj n^(l*−j) ≤ n^(l*).

Or,

m_{l*} + m_{l*−1} n + m_{l*−2} n² + ... + m1 n^(l*−1) ≤ n^(l*),

and, equivalently,

m_{l*} ≤ n^(l*) − m1 n^(l*−1) − m2 n^(l*−2) − ... − m_{l*−1} n    (1)

• But since m_{l*} ≥ 0 we then have

0 ≤ n^(l*) − m1 n^(l*−1) − m2 n^(l*−2) − ... − m_{l*−1} n

and can repeat the procedure to arrive at

m_{l*−1} ≤ n^(l*−1) − m1 n^(l*−2) − m2 n^(l*−3) − ... − m_{l*−2} n    (2)

Codeword Lengths: Necessary and Sufficient Condition

8 / 14

• But dividing by n and noting that m_{l*−1} ≥ 0 we have

0 ≤ n^(l*−2) − m1 n^(l*−3) − m2 n^(l*−4) − ... − m_{l*−2}

equivalently

m_{l*−2} ≤ n^(l*−2) − m1 n^(l*−3) − m2 n^(l*−4) − ...    (3)

• Continuing we get

0 ≤ m3 ≤ n³ − m1 n² − m2 n    (4)
0 ≤ m2 ≤ n² − m1 n            (5)
m1 ≤ n                        (6)

• Note that if ∑_{i=1}^{M} n^(−li) ≤ 1, then the mj satisfy (1)–(6).

Kraft Inequality: Proof of sufficiency

9 / 14

Note that m1 ≤ n.

If m1 = n then we are done (assigning each codeword a single letter).

If m1 < n, we have (n − m1) unused prefixes to form code words of length 2 for which the code words of length 1 are not prefixes. This means that there are (n − m1)n phrases of length 2 to select the codewords from. But this is larger than or equal to the number of codewords of length 2, m2, according to equation (5). So we can construct our code by selecting m2 of the (n − m1)n prefix-free phrases of length 2.

If m2 = (n − m1)n we are done. If m2 < n² − m1 n, there are (n² − m1 n − m2)n code words of length 3 which satisfy the prefix condition, etc.
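A sketch of this constructive step in code (binary case only; the length list below is a hypothetical example): codewords are assigned in order of increasing length, so no shorter word is a prefix of a later one.

def binary_prefix_code(lengths):
    # Works whenever the Kraft inequality holds for the given lengths.
    assert sum(2 ** -l for l in lengths) <= 1
    code, prev, words = 0, 0, []
    for l in sorted(lengths):
        code <<= (l - prev)                 # move to the new length without reusing prefixes
        words.append(format(code, f"0{l}b"))
        code += 1
        prev = l
    return words

print(binary_prefix_code([1, 2, 3, 3]))     # ['0', '10', '110', '111']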

Kraft Inequality: Proof of necessity

10 / 14

Proof of necessity follows from the McMillan inequality.

Theorem [McMillan Inequality]
A necessary and sufficient condition for the existence of a U.D. code with M code words of lengths l1, l2, ..., lM, where the code symbols take on n different values, is:

∑_{i=1}^{M} n^(−li) ≤ 1.

Here we sketch the proof of necessity of the McMillan Inequality (enough to prove the necessity of the Kraft inequality). Proof by contradiction:

1. Assume ∑_{i=1}^{M} n^(−li) = A > 1 and a U.D. code exists.

2. If ∑_{i=1}^{M} n^(−li) > 1, then [ ∑_{i=1}^{M} n^(−li) ]^N = A^N, which grows exponentially with N.

Kraft Inequality: Proof of necessity

11 / 14

3. ( ∑_{i=1}^{M} n^(−li) )^N = ( ∑_{j=1}^{l*} mj n^(−j) )^N = ∑_{k=N}^{Nl*} Nk n^(−k)

where

Nk = ∑_{i1+i2+...+iN = k} m_{i1} m_{i2} ··· m_{iN}

and it denotes the number of strings of N code words whose total length is exactly k.

4. If the code is U.D., Nk ≤ n^k. But then for a U.D. code

( ∑_{i=1}^{M} n^(−li) )^N = ∑_{k=N}^{Nl*} Nk n^(−k) ≤ ∑_{k=N}^{Nl*} 1 = Nl* − N + 1,

which grows linearly with N, not exponentially with N, contradicting bullet point 2.

Complete Proof

12 / 14

Proof of Source Coding Theorem

13 / 14

Lower bound on L for a U.D. Code

Theorem: For any instantaneous code², L ≥ Hn(S). Furthermore, L = Hn(S) iff pi = n^(−li).

Proof: Let p′i = n^(−li) / ∑_{j=1}^{M} n^(−lj). Note p′i ≥ 0 and ∑_{i=1}^{M} p′i = 1.

From before:

Hn(S) = ∑_{i=1}^{M} pi log_n(1/pi) ≤ ∑_{i=1}^{M} pi log_n(1/p′i)

² Here L = average length of the U.D. code and n = number of symbols in the code alphabet.

Proof of Source Coding Theorem

14 / 14

Then:

Hn(S) ≤ ∑_{i=1}^{M} pi li + ∑_{i=1}^{M} pi log_n( ∑_{j=1}^{M} n^(−lj) )

But for a U.D. code, ∑_{j=1}^{M} n^(−lj) ≤ 1, so log_n( ∑_{j=1}^{M} n^(−lj) ) ≤ 0. Hence

Hn(S) ≤ L

Equality occurs if and only if ∑_{j=1}^{M} n^(−lj) = 1 and pi = p′i. But both of these conditions hold iff pi = n^(−li) with li an integer.


1 / 12

Digital Communications III (ECE 154C)

Introduction to Coding and Information Theory

Tara Javidi

These lecture notes were originally developed by late Prof. J. K. Wolf.

UC San Diego

Spring 2014

Asymptotic Optimality

2 / 12

Asymptotic Optimality of Huffman Codes

3 / 12

We have seen that for any U.D. n-ary code corresponding to the Nth extension of the I.I.D. Source S, for every N = 1, 2, ..., LN/N ≥ Hn(S).

→ How do you get the result for general N?

• Next we show that for an n-ary Huffman Code corresponding to the Nth extension of the I.I.D. Source S,

Hn(S) + 1/N > LN/N ≥ Hn(S)

• But this implies that Huffman coding is asymptotically optimal, i.e.

lim_{N→∞} LN/N = Hn(S)

Sketch of the Proof

4 / 12

• From the McMillan Inequality, we have that one can construct a U.D. code such that li = ⌈log_n(1/pi)⌉. In other words, we can construct a U.D. code for which we have

Hn(S) + 1 > L ≥ Hn(S).

• Why?

Asymptotic Optimality of Huffman Coding: Examples

5 / 12

Huffman Codes are asymptotically optimal

6 / 12

EX 1: n = 2, (0.9, 0.09, 0.01) ⇒ H2(S) = 0.516

   Prob   log2(1/pi)   li = ⌈log2(1/pi)⌉
A  0.9    0.152        1
B  0.09   3.47         4
C  0.01   6.64         7

L = 1.33

Note that Hn(S) ≤ L < Hn(S) + 1:   0.516 ≤ 1.33 < 1.516

A better code (this is actually a Huffman Code):

A  0
B  10
C  11

L = 1.1

Huffman Codes are asymptotically optimal

7 / 12

EX 2: n = 2, (0.19, 0.19, 0.19, 0.19, 0.19, 0.05) ⇒ H2(S) = 2.492

   Prob   log2(1/pi)   li = ⌈log2(1/pi)⌉
A  0.19   2.396        3
B  0.19   2.396        3
C  0.19   2.396        3
D  0.19   2.396        3
E  0.19   2.396        3
F  0.05   4.322        5

L = 3.10

Huffman Code:
A  00    D  101
B  01    E  110
C  100   F  111

L = 2.62

Huffman Codes are asymptotically optimal

8 / 12

EX 3: n = 3, (0.19, 0.19, 0.19, 0.19, 0.19, 0.05) ⇒ H2(S) = 2.492 bits, i.e. H3(S) = 2.492 / log2 3 = 1.57 ternary digits

   Prob   log3(1/pi)   li = ⌈log3(1/pi)⌉
A  0.19   1.51         2
B  0.19   1.51         2
C  0.19   1.51         2
D  0.19   1.51         2
E  0.19   1.51         2
F  0.05   2.73         3

L = 2.05

Huffman Code:
A  0    D  12
B  10   E  20
C  11   F  21

L = 1.81

Performance of Huffman Codes

9 / 12

Asymptotic Optimality of Huffman Codes

10 / 12

Theorem: For an I.I.D. source S, a Huffman Code with code alphabet size n exists for the Nth extension of this source such that

Hn(S) ≤ LN/N < Hn(S) + 1/N

Proof: Earlier (slide 3) we have seen that a U.D. code exists for the Nth extension of the source such that

Hn(S^N) ≤ LN < Hn(S^N) + 1

But Hn(S^N) = N Hn(S). So all we need is to show that, for a given fixed source with a fixed alphabet, a Huffman Code is at least as good as any U.D. code.

Q.E.D.

Next we prove the fact, stated earlier without proof, that for any fixed and given alphabet a Huffman code is optimal.

Properties of an Optimal (Compact) Binary Code

11 / 12

We only state the results in the case of binary codes, even though similar lines of argument can be used for n-ary codes.

We only need to consider instantaneous codes!

1. If pi ≤ pj, then li ≥ lj.
   Proof: Otherwise switching the code words will reduce L.

2. There is no single code word of length l* = max li (the longest length is never unique).
   Proof: If there were, shorten it by one digit; it will still not be a prefix of any other code word, and this will shorten L.

3. The code words of length l* occur in pairs in which the code words in each pair agree in all but the last digit.
   Proof: If not, shorten a code word for which this is not the case by one digit; it will not be the prefix of any other code word, and this will shorten L.

Optimality of Binary Huffman Codes

12 / 12

Placeholder Figure A

Lj−1 = Lj + Pα1 + Pα2, since the code at stage (j − 1) is the same as the code at stage (j) except for two words whose length is one more.

We now show that if code Cj is optimal, then code Cj−1 must also be optimal. Suppose not, and there were a better code at stage (j − 1); call its average length L′j−1 < Lj−1. In that code the two code words with probabilities Pα1 and Pα2 are identical in all but the last digit. Form a new code at stage (j) that uses their common prefix as the code word for Pα = Pα1 + Pα2. This code has average length L′j = L′j−1 − (Pα1 + Pα2) < Lj−1 − (Pα1 + Pα2) = Lj. But this can't be the case if Cj was optimal.


1 / 26

Digital Communications III (ECE 154C)

Introduction to Coding and Information Theory

Tara Javidi

These lecture notes were originally developed by late Prof. J. K. Wolf.

UC San Diego

Spring 2014

Coding With Distortion

2 / 26

Coding with Distortion

3 / 26

Placeholder Figure A

ǫ² = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} E[(x(t) − x̂(t))²] dt = M.S.E.

Discrete-Time Signals

4 / 26

If signals are bandlimited, one can sample at the Nyquist rate and convert the continuous-time problem to a discrete-time problem. This sampling is part of the A/D converter.

Placeholder Figure A

ǫ² = lim_{m→∞} (1/m) ∑_{i=1}^{m} E[(xi − x̂i)²]

A/D Conversion and D/A Conversion

5 / 26

A/D Conversion

6 / 26

• Assume a random variable X which falls into the range (Xmin, Xmax).
• The goal is for X to be converted into k binary digits. Let M = 2^k.
• The usual A/D converter first subdivides the interval (Xmin, Xmax) into M equal sub-intervals.
• Sub-intervals are of width ∆ = (Xmax − Xmin)/M.
• Shown below for the case of k = 3 and M = 8.

Placeholder Figure A

• We call the ith sub-interval ℜi.

D/A Conversion

7 / 26

• Assume that X falls in the region ℜi, i.e. x ∈ ℜi.
• The D/A converter uses as an estimate of X the value X̂ = yi, the center of the ith region.
• The mean-squared error between X and X̂ is

ǫ² = E[(X − X̂)²] = ∫_{Xmin}^{Xmax} (x − x̂)² fX(x) dx

where fX(x) is the probability density function of the random variable X.

• Let fX|ℜi(x) be the conditional density function of X given that X falls in the region ℜi. Then

ǫ² = ∑_{i=1}^{M} P[x ∈ ℜi] ∫_{x∈ℜi} (x − yi)² fX|ℜi(x) dx

D/A Conversion

8 / 26

• Note that ∑_{i=1}^{M} P[x ∈ ℜi] = 1 and, for each i = 1, 2, ..., M, ∫_{x∈ℜi} fX|ℜi(x) dx = 1 (each sub-interval ℜi has width ∆).
• Make the further assumption that k is large enough so that fX|ℜi(x) is a constant over the region ℜi.
• Then fX|ℜi(x) = 1/∆ for all i, and

∫_{x∈ℜi} (x − yi)² fX|ℜi(x) dx = (1/∆) ∫_a^b ( x − (a + b)/2 )² dx
                               = (1/∆) ∫_{−∆/2}^{∆/2} x² dx
                               = (1/∆) · (2/3) · (∆/2)³
                               = ∆²/12

D/A Conversion

9 / 26

• In other words, ǫ² = ∑_{i=1}^{M} P[x ∈ ℜi] · ∆²/12 = ∆²/12.
• If X has variance σx², the signal-to-noise ratio of the A-to-D (and D-to-A) converter is often defined as σx² / (∆²/12).
• If Xmin = −∞ and/or Xmax = +∞, then the first and last intervals can be infinite in extent.
• However, fX(x) is usually small enough in those intervals so that the result is still approximately the same.

Placeholder Figure A

SCALAR QUANTIZATION of GAUSSIAN SAMPLES

10 / 26

Scalar Quantization

11 / 26

Placeholder Figure A

• ENCODER:

x ≤ −3b        → 000      0 < x ≤ b     → 100
−3b < x ≤ −2b  → 001      b < x ≤ 2b    → 101
−2b < x ≤ −b   → 010      2b < x ≤ 3b   → 110
−b < x ≤ 0     → 011      3b < x        → 111

• DECODER:

000 → −3.5b    100 → +0.5b
001 → −2.5b    101 → +1.5b
010 → −1.5b    110 → +2.5b
011 → −0.5b    111 → +3.5b
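A sketch of this tabulated quantizer in code (the step size b and the floor-based cell convention are assumptions of the sketch; the table above uses slightly different conventions exactly at the cell edges):

def encode(x, b=1.0):
    # cells of width b between -3b and +3b, with the two end cells unbounded
    i = int(x // b) + 4
    return max(0, min(7, i))             # 3-bit index 0..7 (000..111)

def decode(i, b=1.0):
    return (i - 3.5) * b                 # cell midpoints -3.5b, -2.5b, ..., +3.5b

for x in (-4.2, -0.3, 0.2, 5.0):
    print(x, format(encode(x), "03b"), decode(encode(x)))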

Optimum Scalar Quantizer

12 / 26

• Let us construct boundaries bi (b0 = −∞, bM = +∞) and quantization levels ai such that

bi−1 ≤ x < bi  −→  x̂ = ai,   i = 1, 2, ..., M

Placeholder Figure A

• The question is how to optimize {bi} and {ai} to minimize the distortion ǫ²:

ǫ² = ∑_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx

Optimum Scalar Quantizer

13 / 26

• To optimize ǫ² = ∑_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx we take derivatives and set them equal to zero:

∂ǫ²/∂aj = 0,   ∂ǫ²/∂bj = 0

• And use Leibnitz's Rule:

(d/dt) ∫_{a(t)}^{b(t)} f(x, t) dx = f(b(t), t) · (db(t)/dt) − f(a(t), t) · (da(t)/dt) + ∫_{a(t)}^{b(t)} (∂/∂t) f(x, t) dx

Optimum Scalar Quantizer

14 / 26

∂/∂bj ( ∑_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx )
  = ∂/∂bj ∫_{bj−1}^{bj} (x − aj)² fX(x) dx + ∂/∂bj ∫_{bj}^{bj+1} (x − aj+1)² fX(x) dx
  = (bj − aj)² fX(bj) − (bj − aj+1)² fX(bj) = 0

bj² − 2 aj bj + aj² = bj² − 2 bj aj+1 + aj+1²

2 bj (aj+1 − aj) = aj+1² − aj²

bj = (aj+1 + aj) / 2    (I)

Optimum Scalar Quantizer

15 / 26

∂/∂aj ( ∑_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx ) = −2 ∫_{bj−1}^{bj} (x − aj) fX(x) dx = 0

aj ∫_{bj−1}^{bj} fX(x) dx = ∫_{bj−1}^{bj} x fX(x) dx

aj = ∫_{bj−1}^{bj} x fX(x) dx  /  ∫_{bj−1}^{bj} fX(x) dx    (II)

Optimum Scalar Quantizer

16 / 26

• Note that the {bi} can be found from (I) once the {ai} are known.
  ◦ In fact, the {bi} are the midpoints of the {ai}.
• The {ai} can also be solved from (II) once the {bi} are known.
  ◦ The {ai} are the centroids of the corresponding regions.
• Thus one can use a computer to iteratively solve for the {ai} and the {bi} (a small numerical sketch follows this list):
  1. One starts with an initial guess for the {bi}.
  2. One uses (II) to solve for the {ai}.
  3. One uses (I) to solve for the {bi}.
  4. One repeats steps 2 and 3 until the {ai} and the {bi} "stop changing".
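A numerical sketch of this iteration (assuming a standard normal fX(x) and M = 4; the grid, the iteration count, and the initial guess are all choices of the sketch, not part of the notes):

import numpy as np

def lloyd_max(M=4, iters=100):
    x = np.linspace(-8, 8, 160001)                 # fine grid over the effective support
    f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)     # standard normal density fX(x)
    a = np.linspace(-1, 1, M)                      # initial guess for the levels {ai}
    for _ in range(iters):
        b = (a[:-1] + a[1:]) / 2                   # (I): boundaries are midpoints of the levels
        idx = np.searchsorted(b, x)                # cell index of every grid point
        for i in range(M):                         # (II): each level becomes its cell's centroid
            m = idx == i
            a[i] = np.sum(x[m] * f[m]) / np.sum(f[m])
    return a, b

a, b = lloyd_max()
print(np.round(a, 3))   # approximately [-1.510, -0.453, 0.453, 1.510]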

Comments on Optimum Scalar Quantizer

17 / 26

1. This works for any fX(x).
2. If fX(x) only has finite support, one adjusts b0 and bM to be the limits of the support.

Placeholder Figure A

3. One needs to know ∫_α^β fX(x) dx and ∫_α^β x fX(x) dx (true for any fX(x)).
4. For a Gaussian, we can integrate by parts or let y = x²:

∫_α^β (1/√(2π)) e^(−x²/2) dx = Q(α) − Q(β)

∫_α^β x (1/√(2π)) e^(−x²/2) dx = ...

Comments on Optimum Scalar Quantizer

18 / 26

5. If M = 2^a one could use a binary digits to represent the quantized value. However, since the quantized values are not necessarily equally likely, one could use a HUFFMAN CODE to use fewer binary digits (on the average).
6. After the {ai} and {bi} are known, one computes ǫ² from

ǫ² = ∑_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx

7. For M = 2 and fX(x) = (1/√(2πσ²)) e^(−x²/(2σ²)) we have

b0 = −∞, b1 = 0, b2 = +∞, and a2 = −a1 = √(2σ²/π)

8. It is also easy to show that ǫ² = (1 − 2/π) σ² = 0.3634 σ².

Vector Quantization

19 / 26

Vector Quantization

20 / 26

• One can achieve a smaller ǫ² by quantizing several samples at a time.
• We would then use regions in an m-dimensional space.

Placeholder Figure A

• Shannon characterized this in terms of the "rate-distortion formula," which tells us how small ǫ² can be (as m → ∞).
• For a Gaussian source with one binary digit per sample,

ǫ² ≥ σ²/4 = 0.25 σ²

  ◦ This follows from the result on the next page.
  ◦ Contrast this with the scalar case: ǫs² = (1 − 2/π) σ² = 0.3634 σ².

VQ: Discrete Memoryless Gaussian Source

21 / 26

• Let the source produce i.i.d. Gaussian samples X1, X2, ... where

fX(x) = (1/√(2πσ²)) e^(−x²/(2σ²))

• Let the source encoder produce a sequence of binary digits at a rate of R binary digits/source symbol.
  ◦ In our previous terminology, R = log M.
• Let the source decoder produce the sequence X̂1, X̂2, ..., X̂i, ... such that the mean-squared error between {Xi} and {X̂i} is

ǫ² = (1/n) ∑_{i=1}^{n} E[(Xi − X̂i)²]

VQ: Discrete Memoryless Gaussian Source

22 / 26

• Then one can prove that for any such system

R ≥ (1/2) log2(σ²/ǫ²)   for ǫ² ≤ σ²

  ◦ Note that R = 0 for ǫ² ≥ σ². What does this mean?
  ◦ Note that for R = log M = 1,

1 ≥ (1/2) log2(σ²/ǫ²) ⇒ 2 ≥ log2(σ²/ǫ²) ⇒ 4 ≥ σ²/ǫ² ⇒ ǫ² ≥ (1/4) σ²

• This is an example of "Rate-Distortion Theory."
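As a quick numerical reading of the bound (a sketch; σ² = 1 is an arbitrary choice), rearranging R ≥ (1/2) log2(σ²/ǫ²) gives ǫ² ≥ σ² · 2^(−2R):

sigma2 = 1.0
for R in (0.5, 1, 2, 3):
    print(R, sigma2 * 2 ** (-2 * R))   # R = 1 gives 0.25, matching the bound above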

Reduced Fidelity Audio Compression

23 / 26

Reduced Fidelity Audio Compression

24 / 26

• MP3 players use a form of audio compression called MPEG-1 Audio Layer 3.
• It takes advantage of a psycho-acoustic phenomenon whereby
  ◦ a loud tone at one frequency "masks" the presence of softer tones at neighboring frequencies;
  ◦ hence, these softer neighbouring tones need not be stored (or transmitted).
• The compression efficiency of an audio compression scheme is usually described by the encoded bit rate (prior to the introduction of coding bits).

Reduced Fidelity Audio Compression

25 / 26

• The CD has a bit rate of (44.1 × 10³ × 2 × 16) = 1.41 × 10⁶ bits/second.
  ◦ The term 44.1 × 10³ is the sampling rate, which is approximately the Nyquist rate of the audio to be compressed.
  ◦ The term 2 comes from the fact that there are two channels in a stereo audio system.
  ◦ The term 16 comes from the 16-bit (or 2^16 = 65536 level) A-to-D converter.
  ◦ Note that a slightly higher sampling rate, 48 × 10³ samples/second, is used for a DAT recorder.

Reduced Fidelity Audio Compression

26 / 26

• Different standards are used in MP3 players.
• Several bit rates are specified in the MPEG-1, Layer 3 standard.
  ◦ These are 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 192, 224, 256 and 320 kilobits/sec.
  ◦ The sampling rates allowed are 32, 44.1 and 48 kHz, but the sampling rate of 44.1 × 10³ Hz is almost always used.
• The basic idea behind the scheme is as follows.
  ◦ A block of 576 time-domain samples is converted into 576 frequency-domain samples using a DFT.
  ◦ The coefficients are then modified using psycho-acoustic principles.
  ◦ The processed coefficients are then converted into a bit stream using various schemes, including Huffman encoding.
  ◦ The process is reversed at the receiver: bits −→ frequency-domain coefficients −→ time-domain samples.


1 / 16

Digital Communications III (ECE 154C)

Introduction to Coding and Information Theory

Tara Javidi

These lecture notes were originally developed by late Prof. J. K. Wolf.

UC San Diego

Spring 2014

Information Theory: Definitions and Equalities

2 / 16

Basic Definitions

3 / 16

• Since H(X) = ∑_x p[x] log(1/p[x])
• Then H(X,Y) = ∑_x ∑_y p[x,y] log(1/p[x,y])
• Define H[X|Y = y] = ∑_x p(x|y) log(1/p(x|y))
• Define H[X|Y] = ∑_y p(y) H[X|Y = y]
• Easy to see

H[X|Y] = ∑_y p(y) ∑_x p(x|y) log(1/p(x|y)) = ∑_x ∑_y p(x,y) log(1/p(x|y))
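A small sketch computing these quantities from a joint pmf (the 2×2 pmf below is a made-up example, not from the notes):

import math

p = {('a', 0): 0.4, ('a', 1): 0.1,
     ('b', 0): 0.1, ('b', 1): 0.4}        # joint pmf p(x, y)

def H(dist):
    return sum(q * math.log2(1 / q) for q in dist.values() if q > 0)

px, py = {}, {}
for (x, y), q in p.items():
    px[x] = px.get(x, 0) + q
    py[y] = py.get(y, 0) + q

H_X_given_Y = 0.0
for y, qy in py.items():
    cond = {x: p[(x, y)] / qy for x in px}          # p(x | y)
    H_X_given_Y += qy * H(cond)                     # average of H[X | Y = y]

print(H(px), H(py), H(p), H_X_given_Y)              # note H(p) - H(py) equals H_X_given_Y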

Fundamental Equalities

4 / 16

• But p[x,y] = p[y|x] p[x], so

H[X,Y] = ∑_x ∑_y p[x,y] [ log(1/p[y|x]) + log(1/p[x]) ]
       = ∑_x ∑_y p[x,y] log(1/p[y|x]) + ∑_x ( ∑_y p[x,y] ) log(1/p[x])
       = ∑_x ∑_y p[x,y] log(1/p[y|x])   (defined as H[Y|X])
         + ∑_x p[x] log(1/p[x])         (which is H[X])

• In other words

H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)

Basic Definition

5 / 16

• Define I(X;Y) = ∑_x ∑_y p(x,y) log[ p(x,y) / (p(x) p(y)) ]
• It is easy to see that

1. I(X;Y) = H(X) + H(Y) − H(X,Y), or equivalently H(X,Y) = H(X) + H(Y) − I(X;Y)
2. I(X;Y) = H(X) − H(X|Y)
3. I(X;Y) = H(Y) − H(Y|X)

A Basic Inequality

6 / 16

I(X;Y) ≥ 0 (or −I(X;Y) ≤ 0), with equality iff X and Y are statistically independent.

PROOF:

−Ia(X;Y) = ∑_x ∑_y p[x,y] loga[ p(x)p(y) / p(x,y) ]
         = (loga e) ∑_x ∑_y p[x,y] ln[ p(x)p(y) / p(x,y) ]
         ≤ (loga e) ln( ∑_x ∑_y p[x,y] [ p(x)p(y) / p(x,y) ] )   (by Jensen's inequality, since ln is concave)
         = 0

with equality iff p[x,y] = p[x] p[y].

As a consequence:

H(X|Y) ≤ H(X) and H(Y|X) ≤ H(Y), with equality iff X and Y are independent.

Three Examples

7 / 16

EXAMPLE 1

8 / 16

EX 1: One die,

X = outcome ∈ {1, 2, 3, 4, 5, 6}
Y ∈ {odd, even} −→ {0, 1}

H(X) = log2 6
H(Y|X) = 0
H(X,Y) = log2 6
H(Y) = 1 = log2 2
H(X|Y) = log2 3

H(X,Y) = log2 6 = log2 6 + 0 = log2 2 + log2 3

EXAMPLE 2

9 / 16

EX 2: Two dice (independent throws): (X1, X2)

H(X2|X1) = H(X2) = log2 6 = 2.58496
H(X1, X2) = 2 log2 6 = log2 36 = 5.16993

Let us define Y = X1 + X2.

What is H(X1, X2 | Y)?
It is easier to calculate H(Y | X1, X2) and H(Y) and then use

H(X1, X2 | Y) = H(Y | X1, X2) + H(X1, X2) − H(Y)

since

H(X1, X2, Y) = H(Y | X1, X2) + H(X1, X2)

but also

H(X1, X2, Y) = H(X1, X2 | Y) + H(Y)

EX 2 (continued)

10 / 16

Since H(Y | X1, X2) = 0, all we need to calculate is H(Y).

y    (X1, X2) pairs          P[Y = y]
2    (1,1)                   1/36
3    (1,2), (2,1)            2/36
4    (1,3), (2,2), (3,1)     3/36
5    ...                     4/36
...  ...                     ...
12   (6,6)                   1/36

H(Y) = 2 [ (1/36) log2 36 + ... + (5/36) log2(36/5) ] + (6/36) log2 6 = 3.27441

H(X1, X2 | Y) = 5.16993 − 3.27441 = 1.89552
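A one-line check of these two numbers (a sketch; nothing here beyond the two-dice model above):

import math
from collections import Counter

counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
HY = sum((c / 36) * math.log2(36 / c) for c in counts.values())
print(round(HY, 5), round(2 * math.log2(6) - HY, 5))   # 3.27441 and 1.89552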

EX 2 (continued)

11 / 16

Another question: What is H(X1 | Y)?

We take advantage of the following equality:

H(X1, X2 | Y) = H(X1 | Y) + H(X2 | X1, Y).

Now notice that X1 and Y together determine X2 (since X2 = Y − X1), so

H(X2 | X1, Y) = 0.

Hence,

H(X1 | Y) = H(X1, X2 | Y) = 1.89552,

and by symmetry this also equals H(X2 | Y).

EXAMPLE 3

12 / 16

EX 3: Assume a tennis match is played between two equally matched players A and B and that the first player to win 3 sets is the winner. Let X represent the outcomes of the sets. Some of the possible values of X are: AAA, BBB, ABBB, BAAA, BABAA, etc. Let Y be the number of sets played. Then Y takes on values 3, 4, or 5. Let Z represent the player who won the match. For example, if X = ABBAA, then Y = 5 and Z = A.

1. Compute H(X), H(Y), H(Z).
2. Compute H(X,Y), H(X,Z), H(Y,Z).
3. Prove that H(X|Y) = H(X) − H(Y).

EX 3 (continued)

13 / 16

X      Prob   Y   Z
AAA    1/8    3   A
BBB    1/8    3   B
(P[Y = 3] = 2/8 = 1/4)

BAAA   1/16   4   A
ABAA   1/16   4   A
AABA   1/16   4   A
ABBB   1/16   4   B
BABB   1/16   4   B
BBAB   1/16   4   B
(P[Y = 4] = 6/16 = 3/8)

EX 3 (continued)

14 / 16

X       Prob   Y   Z
BBAAA   1/32   5   A
BABAA   1/32   5   A
...     ...    5   ...
AABBB   1/32   5   B
ABABB   1/32   5   B
ABBAB   1/32   5   B
BAABB   1/32   5   B
BABAB   1/32   5   B
BBAAB   1/32   5   B
(12 sequences of length 5 in all; P[Y = 5] = 12/32 = 3/8)

EX 3 (continued)

15 / 16

1. H(X) = 2 [ (1/8) log 8 ] + 6 [ (1/16) log 16 ] + 12 [ (1/32) log 32 ]
        = 6/8 + 3/2 + 15/8 = 33/8 = 4.125 (base 2)

   H(Y) = (1/4) log 4 + 2 ( (3/8) log(8/3) )
        = 1/2 + 9/4 − (3/4) log 3 = 11/4 − (3/4) log 3 = 1.56 (base 2)

   H(Z) = log 2 = 1 (base 2)

2. H(X,Y) = H(X) + H(Y|X) = H(X) + 0 = 33/8 = 4.125

   H(X,Z) = H(X) + H(Z|X) = H(X) + 0 = 33/8 = 4.125

   H(Y,Z) = H(Y) + H(Z|Y) = H(Y) + H(Z) = 15/4 − (3/4) log 3 = 2.56

EX 3 (continued)

16 / 16

3. H(X|Y) = H(X,Y) − H(Y) = [ H(X) + H(Y|X) ] − H(Y) = H(X) − H(Y) = 2.57, since H(Y|X) = 0.

Similarly,

H(Y|X) = H(Z|X) = 0

H(Z|Y) = H(Y,Z) − H(Y) = 2.56 − 1.56 = 1 = H(Z)

H(Z|X,Y) = H(Z|X) = 0, while H(Z|Y) = 1 = H(Z)

H(X|Z) = H(X,Z) − H(Z) = H(X) − H(Z) = 3.125

H(X,Y,Z) = H(Z|X,Y) + H(X,Y) = 0 + 4.125 = 4.125, etc.


1 / 23

Digital Communications III (ECE 154C)

Introduction to Coding and Information Theory

Tara Javidi

These lecture notes were originally developed by late Prof. J. K. Wolf.

UC San Diego

Spring 2014

Discrete Memoryless Channels

2 / 23

Discrete Memoryless Channel

3 / 23

X1...XN −→ [ DMC ] −→ Y1...YN

P_{Y1...YN | X1...XN}(y1...yN | x1...xN) = P_{Y|X}(y1|x1) P_{Y|X}(y2|x2) ··· P_{Y|X}(yN|xN)

From the memoryless property of the channel, the "Single Input – Single Output" representation is sufficiently informative:

X −→ [ DMC ] −→ Y,   P_{Y|X}(y|x)

Channel Capacity: Fundamental Limits

4 / 23

Maximum Mutual Information

5 / 23

=⇒ An important property of a DMC is its mutual information, I(X;Y).
=⇒ But in order to calculate I(X;Y) we need to know PX,Y(x,y) = PY|X(y|x) PX(x).
=⇒ Thus, in order to calculate I(X;Y) one has to specify an input distribution PX(x).
=⇒ The capacity, C, of a DMC is the maximum I(X;Y) that can be achieved over all input distributions:

C = max_{PX(x)} I(X;Y)

Channel Capacity: Intuition

6 / 23

• Recall that I(X;Y) = H(X) − H(X|Y).
• Recall the interpretation that
  ◦ for any random variable X, H(X) measures the randomness or uncertainty about X;
  ◦ Zy = X|Y=y is a new random variable with (conditional) pmf PX|Y(x|Y = y);
  ◦ H(X|Y) is the entropy of this random variable averaged over choices of Y = y.
• Mutual Information is nothing but the average reduction in the randomness of X after Y is observed.
• Channel Capacity is the maximum such reduction in uncertainty when one can design X:

C = max_{PX(x)} I(X;Y)

Multiple Input – Multiple Output

7 / 23

=⇒ One can show that for a DMC

max_{P_{X1···XN}(x1,...,xN)} (1/N) I(X1, ..., XN ; Y1, ..., YN) = max_{PX(x)} I(X;Y)

=⇒ How?

Channel Coding and Capacity

8 / 23

Coding For A Binary Input DMC

9 / 23

W −→ [Channel Encoder] −→ x (m-vector) −→ [Binary-Input DMC] −→ y (m-vector) −→ [Channel Decoder] −→ Ŵ

• Code Rate: R < 1 (at most one message digit per transmitted digit)
• Message: an integer, denoted by W, between 1 and 2^(mR) (equivalently, a binary vector of length mR)

Assumption: messages occur with equal probabilities,

P[W = i] = 1 / 2^(mR),   for all i = 1, 2, ..., 2^(mR)

Coding For A Binary Input DMC

10 / 23

W −→ [Channel Encoder] −→ x (m-vector) −→ [Binary-Input DMC] −→ y (m-vector) −→ [Channel Decoder] −→ Ŵ

• Block Code (Binary): a collection of 2^(mR) binary vectors of length m
• Channel Decoder: chooses the most likely code word (or equivalently the most likely message) based upon the received vector y
• Error Probability: the probability that the message produced by the decoder is not equal to the original message, i.e. P{Ŵ ≠ W}

Channel Coding Theorem

11 / 23

W −→ [Channel Encoder] −→ x (m-vector) −→ [Binary-Input DMC] −→ y (m-vector) −→ [Channel Decoder] −→ Ŵ

Channel Coding Theorem:
Let R < C (base 2). For m large enough, there exists an encoder and a decoder such that P{Ŵ ≠ W} < ǫ for any ǫ > 0.

Computation of Capacity for Well-known Channels

12 / 23

Capacity of BSC

13 / 23

EXAMPLE: Binary Symmetric Channel (BSC)

Placeholder Figure A

I2(X;Y) = H2(Y) − H2(Y|X)

H2(Y|X) = ∑_{x=0}^{1} ∑_{y=0}^{1} PX(x) PY|X(y|x) log2( 1 / PY|X(y|x) )
        = ∑_{x=0}^{1} PX(x) [ (1 − p) log2(1/(1 − p)) + p log2(1/p) ]
        = h2(p)

where h2(p) := (1 − p) log2(1/(1 − p)) + p log2(1/p).

Capacity of BSC

14 / 23

EXAMPLE: Binary Symmetric Channel (BSC)

On the other hand, H2(Y) ≤ log2 2 with equality iff P[Y = 0] = P[Y = 1] = 1/2.

But note that if PX(0) = PX(1) = 1/2, then PY(0) = PY(1) = 1/2.

In other words,

C = max_{PX(x)} I(X;Y) = log2 2 − h2(p) = 1 − h2(p).
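A quick numerical sketch of this formula (the crossover probabilities below are arbitrary examples):

import math

def h2(p):
    return 0.0 if p in (0.0, 1.0) else p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

for p in (0.0, 0.05, 0.11, 0.5):
    print(p, round(1 - h2(p), 4))        # capacity 1 - h2(p) of the BSC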

Capacity of BEC

15 / 23

EXAMPLE: Binary Erasure Channel (BEC)

Placeholder Figure A

I2(X;Y) = H2(X) − H2(X|Y)

Assume P[X = 0] = α and P[X = 1] = 1 − α
(⇒ PY(2) = p, PY(0) = α(1 − p), and PY(1) = (1 − α)(1 − p)).

Hence, H2(X) = α log2(1/α) + (1 − α) log2(1/(1 − α)) = h2(α).

On the other hand,

H2(X|Y) = ∑_{y=0}^{2} PY(y) ∑_{x=0}^{1} PX|Y(x|y) log2( 1 / PX|Y(x|y) )

where the inner sum is H(X | Y = y).

Capacity of BEC

16 / 23

Also,

H(X | Y = 0) = ∑_{x=0}^{1} PX|Y(x|0) log2( 1 / PX|Y(x|0) ) = 0

H(X | Y = 1) = ∑_{x=0}^{1} PX|Y(x|1) log2( 1 / PX|Y(x|1) ) = 0

H(X | Y = 2) = ∑_{x=0}^{1} PX|Y(x|2) log2( 1 / PX|Y(x|2) )
             = α log2(1/α) + (1 − α) log2(1/(1 − α)) = h2(α)

Hence,

C = max_α [ h2(α) − p h2(α) ] = 1 − p   (the maximum is at α = 1/2).

Capacity of a Z-Channel

17 / 23

EXAMPLE:

Placeholder Figure A

Recall I2(X;Y) = H2(Y) − H2(Y|X).

Assume P[X = 0] = α and P[X = 1] = 1 − α.

Hence, H2(X) = α log2(1/α) + (1 − α) log2(1/(1 − α)) = h2(α).

Also, H2(Y|X) = α h2(1/2). So

I2(X;Y) = H2(Y) − H2(Y|X) = h2(α/2) − α h2(1/2)
        = (α/2) log2(2/α) + (1 − α/2) log2( 1/(1 − α/2) ) − α h2(1/2)

Page 181: Digital Communications III (ECE 154C) Introduction …acsweb.ucsd.edu/~afazelic/ece154c/LectureNotes...1 / 8 Digital Communications III (ECE 154C) Introduction to Coding and Information

Capacity of a Z-Channel

Discrete Memoryless

Channel

Capacity of DMC

Channel Coding

Capacity: Examples

• BSC

• BSC

• BEC

• BEC

• Z-Channel

• Z-Channel

• 5-ary Symmetric

• 5-ary Symmetric

• Code for 5-ary

Symmetric

Computing Capacity

18 / 23

Equivalently Ie(x; y) =α2 ln 2

α(1−α2 ) ln

11−α

2

− αhe(12).

We maximize Ie(X;Y ) over α (same α maximizes I2(X;Y )).

0 =δIe(X;Y )

δα=

1

2ln

2

α−

α

2.1

α+

(

1−α

2

) 1/2(1− α

2

) − he(1

2)

︸ ︷︷ ︸

ln 2

=1

2ln 2−

1

2lnα−

1

2+

1

2+

1

2ln

(

1−α

2

)

− ln 2

This means

$$\frac{1}{2}\ln\frac{\alpha}{1-\frac{\alpha}{2}} = -\frac{1}{2}\ln 2 = \frac{1}{2}\ln\frac{1}{2} \;\Longrightarrow\; \frac{1-\frac{\alpha}{2}}{\alpha} = 2 \;\Longrightarrow\; \alpha = \frac{2}{5}$$

$$C_2 = I_2(X;Y)\big|_{\alpha=2/5} = \frac{1}{5}\log_2 5 + \frac{4}{5}\log_2\frac{5}{4} - \frac{2}{5} = \log_2 5 - 2 \approx 0.3219$$
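A small numerical check (illustrative, not part of the notes): a grid search over $\alpha$ for $I_2(\alpha) = h_2(\alpha/2) - \alpha\,h_2(1/2)$ reproduces $\alpha^\ast = 2/5$ and $C_2 \approx 0.3219$ bits.

```python
import math

def h2(a: float) -> float:
    """Binary entropy function, in bits."""
    return 0.0 if a in (0.0, 1.0) else -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def I2(alpha: float) -> float:
    """Mutual information of this Z-channel (input 0 flipped with prob. 1/2), in bits."""
    return h2(alpha / 2) - alpha * h2(0.5)

alphas = [i / 10000 for i in range(10001)]
a_star = max(alphas, key=I2)
print(a_star, I2(a_star))  # -> 0.4  0.3219..., matching alpha = 2/5 and C2 = log2(5) - 2
```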


Capacity of a 5-ary Symmetric Channel


EXAMPLE: 5-ary Symmetric Channel

Placeholder Figure A (each input symbol $x \in \{0,1,2,3,4\}$ is received as $x$ or $x+1 \pmod 5$, each with probability $1/2$)

By symmetry, the uniform input distribution achieves the maximum:

$$C = \max_{P(X)} I(X;Y) = I(X;Y)\big|_{P[X=x]=\frac{1}{5},\;x=0,1,2,3,4} = \bigl[H(Y) - H(Y|X)\bigr]_{P[X=x]=\frac{1}{5},\;x=0,1,2,3,4}.$$

If the inputs are equally likely, the outputs are equally likely. Thus,

$$H(Y) = \log 5.$$

So what remains is to compute $H(Y|X)\big|_{P[X=x]=\frac{1}{5},\;x=0,1,2,3,4}$.


For any value of the input, there are two possible outputs that are equally likely. In other words,

$$H(Y|X=x) = \log 2, \quad x = 0,\dots,4.$$

Hence,

$$H(Y|X) = \log 2,$$

and

$$C = \log 5 - \log 2 = \log\frac{5}{2}.$$

Note: one can also measure the capacity in 5-ary digits, in which case it is simply $C_5 = \log_5\frac{5}{2}$.
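As a numerical illustration (not part of the original notes), the sketch below assumes the transition structure implied by the code-word table below (each input $x$ is received as $x$ or $x+1 \bmod 5$, each with probability $1/2$) and evaluates $H(Y)-H(Y|X)$ under the uniform input:

```python
import math

# Assumed channel (consistent with the code-word table below):
# each input x in {0,...,4} is received as x or (x + 1) mod 5, each with probability 1/2.
P = [[0.0] * 5 for _ in range(5)]
for x in range(5):
    P[x][x] = 0.5
    P[x][(x + 1) % 5] = 0.5

px = [1 / 5] * 5                                                  # uniform input
py = [sum(px[x] * P[x][y] for x in range(5)) for y in range(5)]   # output distribution

H_Y = -sum(q * math.log2(q) for q in py if q > 0)
H_Y_given_X = -sum(px[x] * P[x][y] * math.log2(P[x][y])
                   for x in range(5) for y in range(5) if P[x][y] > 0)

print(H_Y - H_Y_given_X, math.log2(5 / 2))  # both ~= 1.3219 bits
```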


A Simple Code for the 5-ary Symmetric Channel


Code Words → Possible Outputs

00 → 00, 01, 10, 11
12 → 12, 13, 22, 23
24 → 24, 20, 34, 30
31 → 31, 32, 41, 42
43 → 43, 44, 03, 04

The decoder works in reverse (and never gets confused, since the five sets of possible outputs are disjoint!).

There are 5 code words, and each code word can represent one 5-ary digit, or $\log_2 5$ binary digits. Since each code word represents 2 uses of the channel, the rate of the code is $\frac{1}{2} = 0.5$ 5-ary digits ($\frac{1}{2}\log_2 5 = 1.1609$ binary digits) per channel use.

Now note that this rate is below the capacity of the channel: $\frac{1}{2}\log_2 5 = 1.1609 \le \log_2\frac{5}{2} = 1.3219$ (or, equivalently, $\frac{1}{2} \le \log_5\frac{5}{2} \approx 0.569$).
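An illustrative sketch of this code (variable names and structure are my own): because the five sets of possible channel outputs are disjoint, a simple lookup table decodes every received pair unambiguously.

```python
import itertools

codewords = ["00", "12", "24", "31", "43"]

def possible_outputs(word: str):
    """All received pairs the code word can produce on this channel (x -> x or x+1 mod 5)."""
    choices = [(int(c), (int(c) + 1) % 5) for c in word]
    return {"".join(str(s) for s in combo) for combo in itertools.product(*choices)}

decode = {}
for w in codewords:
    for out in possible_outputs(w):
        assert out not in decode, "ambiguous received pair"  # never triggers: sets are disjoint
        decode[out] = w

print(len(decode))   # 20 received pairs in total, 4 per code word
print(decode["13"])  # -> '12'
```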


Computation of Capacity



Finding the Capacity Using a Computer


• For an arbitrary DMC, one cannot, in general, use analytic techniques to find the channel capacity.

• Instead, one can use a computer to search for the input distribution $P_X(x)$ that maximizes $I(X;Y)$.

• There is a special algorithm, called the BLAHUT-ARIMOTO algorithm, for doing this (see the sketch after this list). For simple cases, a brute-force search can be used.

EXAMPLE:

Placeholder Figure A

1. Use a computer to evaluate $I(X;Y) = f(\alpha)$.
2. Optimize with respect to $\alpha$.
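Below is a compact sketch of the Blahut-Arimoto iteration (an illustrative implementation, not taken from the notes). Starting from a uniform input distribution $r(x)$, it alternates between computing the posterior $q(x|y)$ and re-weighting $r(x)$, converging to the capacity-achieving input distribution; applied to a BSC transition matrix it reproduces $1 - h_2(p)$.

```python
import math

def blahut_arimoto(P, iters=200):
    """Estimate the capacity (bits/use) of a DMC with transition matrix P[x][y] = P(y|x)."""
    nx, ny = len(P), len(P[0])
    r = [1.0 / nx] * nx                       # input distribution, initialized uniform

    def posterior(r):
        """q(x|y) proportional to r(x) * P(y|x), normalized over x for each y."""
        q = []
        for y in range(ny):
            col = [r[x] * P[x][y] for x in range(nx)]
            s = sum(col)
            q.append([v / s for v in col] if s > 0 else col)
        return q

    for _ in range(iters):
        q = posterior(r)
        # r(x) proportional to exp( sum_y P(y|x) ln q(x|y) )
        w = [math.exp(sum(P[x][y] * math.log(q[y][x]) for y in range(ny) if P[x][y] > 0))
             for x in range(nx)]
        total = sum(w)
        r = [v / total for v in w]

    q = posterior(r)
    # I(X;Y) = sum_{x,y} r(x) P(y|x) log2( q(x|y) / r(x) )
    return sum(r[x] * P[x][y] * math.log2(q[y][x] / r[x])
               for x in range(nx) for y in range(ny) if P[x][y] > 0)

p = 0.1
print(blahut_arimoto([[1 - p, p], [p, 1 - p]]))  # ~0.5310 = 1 - h2(0.1)
```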