Capacity of Permutation Channels Anuran Makur Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology 7 October 2020 Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 1 / 39




Outline

1 Introduction
  Three Motivations
  Permutation Channel Model
  Information Capacity
  Example: Binary Symmetric Channel

2 Achievability and Converse for the BSC

3 General Achievability Bound

4 General Converse Bounds

5 Conclusion

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 2 / 39

Three Motivations

Coding theory: [DG01], [Mit06], [Met09], [KV15], [KT18], . . .
  Random deletion channel: LDPC codes nearly achieve capacity for large alphabets
  Codes that correct transpositions of symbols
  Permutation channels with insertions, deletions, substitutions, or erasures
  Construction and analysis of multiset codes

Communication networks: [XZ02], [WWM09], [GG10], [KV13], . . .
  Mobile ad hoc networks, multipath routed networks, etc.
  Out-of-order delivery of packets
  Correcting packet errors/losses when packets do not carry sequence numbers

Molecular/biological communications: [YKGR+15], [KPM16], [HSRT17], [SH19], . . .
  DNA-based storage systems
  Source data encoded into DNA molecules
  Fragments of DNA molecules cached
  Receiver reads encoded data by shotgun sequencing (i.e., random sampling)

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 3 / 39

Motivation: Point-to-point Communication in Packet Networks

[Diagram: SENDER → NETWORK → RECEIVER]

Model the communication network as a channel:

  Alphabet symbols = all possible b-bit packets ⇒ 2^b input symbols
  Multipath routed network (or evolving network topology) ⇒ packets received with transpositions
  Packets are impaired (e.g., deletions, substitutions, etc.) ⇒ model using channel probabilities

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 4 / 39

Example: Coding for Random Deletion Network

Consider a communication network where packets can be dropped:

[Diagram: SENDER → NETWORK → RECEIVER, abstracted as a RANDOM DELETION (equivalently, ERASURE) CHANNEL followed by a RANDOM PERMUTATION]

Abstraction:

  n-length codeword = sequence of n packets
  Random deletion channel: delete each symbol/packet independently with prob p ∈ (0, 1)
  Equivalent erasure channel: erase each symbol/packet independently with prob p ∈ (0, 1)
  Random permutation block: randomly permute the packets of the codeword
  Coding: add sequence numbers and use standard coding techniques (packet size = b + log(n) bits, alphabet size = n · 2^b)
  More refined coding techniques simulate sequence numbers [Mit06], [Met09]

How do you code in such channels without increasing the alphabet size?

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 5 / 39
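A minimal sketch of the sequence-numbering idea above: each packet is tagged with its index before transmission, so the receiver can undo the permutation and treat deletions as erasures at known positions. The payload names and parameters here are illustrative, not from the talk.

```python
import random

def transmit(packets, p, rng):
    """Tag each packet with a sequence number, then apply random deletion
    followed by a random permutation, as in the network model above."""
    tagged = list(enumerate(packets))                # (seq_no, payload)
    kept = [t for t in tagged if rng.random() >= p]  # i.i.d. deletion w.p. p
    rng.shuffle(kept)                                # out-of-order delivery
    return kept

def receive(received, n):
    """Sequence numbers undo the permutation; missing numbers turn deletions
    into erasures ('?') at known positions, so standard erasure codes apply."""
    out = ["?"] * n
    for seq_no, payload in received:
        out[seq_no] = payload
    return out

rng = random.Random(0)
sent = ["pkt%d" % i for i in range(8)]
got = receive(transmit(sent, p=0.25, rng=rng), n=8)
# Every delivered packet is back in its original slot; the rest are erasures.
assert all(v in ("?", "pkt%d" % i) for i, v in enumerate(got))
```

The cost is exactly the one noted above: each packet grows by log(n) bits of sequence number, which is what the refined schemes of [Mit06], [Met09] try to avoid.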

Permutation Channel Model

[Block diagram: M → ENCODER → X → CHANNEL → Z → RANDOM PERMUTATION → Y → DECODER → M̂]

  Sender sends message M ∼ Uniform(M)
  n = blocklength
  Randomized encoder f_n : M → X^n produces codeword X_1^n = (X_1, . . . , X_n) = f_n(M)
  Discrete memoryless channel P_{Z|X} with input & output alphabets X & Y produces Z_1^n:

      P_{Z_1^n | X_1^n}(z_1^n | x_1^n) = ∏_{i=1}^{n} P_{Z|X}(z_i | x_i)

  Random permutation π generates Y_1^n from Z_1^n: Y_{π(i)} = Z_i for i ∈ {1, . . . , n}
  Randomized decoder g_n : Y^n → M ∪ {error} produces estimate M̂ = g_n(Y_1^n) at the receiver

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 6 / 39
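The model above is easy to simulate; a minimal sketch, with an illustrative DMC (a BSC, chosen by us, not a specific example from the talk):

```python
import random
from collections import Counter

def permutation_channel(x, P, rng):
    """One use of the permutation channel: each input symbol passes through
    the memoryless channel P (row x gives P_{Z|X}(.|x)), and the outputs
    are then uniformly randomly permuted."""
    z = [rng.choices(range(len(P[xi])), weights=P[xi])[0] for xi in x]
    rng.shuffle(z)  # Y_{pi(i)} = Z_i for a uniformly random permutation pi
    return z

# Illustrative DMC: a BSC with crossover probability 0.1.
P = [[0.9, 0.1],
     [0.1, 0.9]]
rng = random.Random(1)
x = [0] * 5 + [1] * 5
y = permutation_channel(x, P, rng)
# The receiver sees y only up to ordering, so only its histogram is informative:
print(Counter(y))
```

The shuffle is what makes this channel different from a plain DMC: the decoder can exploit only the empirical distribution of Y_1^n, which is the intuition behind the log(n) rate normalization introduced later.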

Permutation Channel Model

What if we analyze the “swapped” model?

[Block diagram: M → ENCODER → X → RANDOM PERMUTATION → V → CHANNEL → W → DECODER → M̂]

Proposition (Equivalent Models)

If channel P_{W|V} is equal to channel P_{Z|X}, then channel P_{W_1^n | X_1^n} is equal to channel P_{Y_1^n | X_1^n}.

[Block diagram: M → ENCODER → X → CHANNEL → Z → RANDOM PERMUTATION → Y → DECODER → M̂]

Remarks:

  The proof follows from a direct calculation.
  We can analyze either model!

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 7 / 39

Coding for the Permutation Channel

[Block diagram: M → ENCODER → X → CHANNEL → Z → RANDOM PERMUTATION → Y → DECODER → M̂]

General principle: “Encode the information in an object that is invariant under the [permutation] transformation.” [KV13]

Multiset codes are studied in [KV13], [KV15], and [KT18].

What are the fundamental information theoretic limits of this model?

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 8 / 39

Information Capacity of the Permutation Channel

[Block diagram: M → ENCODER → X → CHANNEL → Z → RANDOM PERMUTATION → Y → DECODER → M̂]

  Average probability of error: P_error^n ≜ P(M̂ ≠ M)
  “Rate” of a coding scheme (f_n, g_n) is R ≜ log(|M|) / log(n), i.e., |M| = n^R, because the number of empirical distributions of Y_1^n is poly(n)
  Rate R ≥ 0 is achievable ⇔ ∃ {(f_n, g_n)}_{n∈N} such that lim_{n→∞} P_error^n = 0

Definition (Permutation Channel Capacity)

  C_perm(P_{Z|X}) ≜ sup{R ≥ 0 : R is achievable}

Main Question

  What is the permutation channel capacity of a general P_{Z|X}?

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 9 / 39
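The poly(n) count behind the log(n) normalization is a standard type-counting fact: the number of empirical distributions (types) of an n-length sequence over a k-letter alphabet is C(n+k−1, k−1), a polynomial in n of degree k−1. A quick sanity check (our own, not code from the talk):

```python
from math import comb

def num_types(n, k):
    """Number of empirical distributions (types) of an n-length sequence
    over a k-letter alphabet: weak compositions of n into k parts."""
    return comb(n + k - 1, k - 1)

# Binary case: the histogram is determined by the number of ones, so n + 1 types.
assert num_types(10, 2) == 11

# In general the count is a degree-(k-1) polynomial in n, i.e., poly(n),
# which is why |M| = n^R (rather than 2^{nR}) is the right scaling here.
for n in (10, 100, 1000):
    assert num_types(n, 3) == (n + 2) * (n + 1) // 2
```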

Example: Binary Symmetric Channel

[Block diagram: M → ENCODER → BSC(p) → RANDOM PERMUTATION → DECODER → M̂]

The channel is a binary symmetric channel, denoted BSC(p):

  ∀ z, x ∈ {0, 1}:  P_{Z|X}(z|x) = 1 − p for z = x, and p for z ≠ x

  Alphabets are X = Y = {0, 1}
  Assume crossover probability p ∈ (0, 1) and p ≠ 1/2

Question: What is the permutation channel capacity of the BSC?

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 10 / 39


Outline

1 Introduction

2 Achievability and Converse for the BSC
  Encoder and Decoder
  Testing between Converging Hypotheses
  Second Moment Method for TV Distance
  Fano’s Inequality and CLT Approximation

3 General Achievability Bound

4 General Converse Bounds

5 Conclusion

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 11 / 39

Warm-up: Sending Two Messages

[Block diagram: M → ENCODER → BSC(p) → RANDOM PERMUTATION → DECODER → M̂]

  Fix a message m ∈ {0, 1}, and encode m as f_n(m) = X_1^n i.i.d. ∼ Ber(q_m)

  [Figure: q_0 = 1/3 and q_1 = 2/3 marked on the interval [0, 1]]

  Memoryless BSC(p) outputs Z_1^n i.i.d. ∼ Ber(p ∗ q_m), where p ∗ q_m ≜ p(1 − q_m) + q_m(1 − p) is the convolution of p and q_m
  Random permutation generates Y_1^n i.i.d. ∼ Ber(p ∗ q_m)
  Maximum likelihood (ML) decoder: M̂ = 1{(1/n) ∑_{i=1}^{n} Y_i ≥ 1/2} (for p < 1/2)
  (1/n) ∑_{i=1}^{n} Y_i → p ∗ q_m in probability as n → ∞ ⇒ lim_{n→∞} P_error^n = 0, since p ∗ q_0 ≠ p ∗ q_1

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 12 / 39

Page 60: Capacity of Permutation Channels - cs.purdue.edu

Encoder and Decoder

Suppose M = {1, ..., n^R} for some R > 0

Randomized encoder: Given m ∈ M, f_n(m) = X_1^n i.i.d. ∼ Ber(m/n^R)

[Figure: the encoding parameters m/n^R spread over the unit interval [0, 1]]

Given m ∈ M, Y_1^n i.i.d. ∼ Ber(p ∗ m/n^R) (as before)

ML decoder: For y_1^n ∈ {0, 1}^n, g_n(y_1^n) = argmax_{m ∈ M} P_{Y_1^n|M}(y_1^n|m)

Trade-off: Although (1/n) ∑_{i=1}^n Y_i → p ∗ m/n^R in probability as n → ∞, consecutive messages become indistinguishable, i.e. m/n^R − (m+1)/n^R → 0

Fact: Consecutive messages distinguishable ⇒ lim_{n→∞} P_error^n = 0

What is the largest R such that two consecutive messages can be distinguished?

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 13 / 39


Page 67: Capacity of Permutation Channels - cs.purdue.edu

Testing between Converging Hypotheses

Binary Hypothesis Testing:

Consider hypothesis H ∼ Ber(1/2) with uniform prior

For any n ∈ N, q ∈ (0, 1), and R > 0, consider likelihoods:

Given H = 0: X_1^n i.i.d. ∼ P_{X|H=0} = Ber(q)
Given H = 1: X_1^n i.i.d. ∼ P_{X|H=1} = Ber(q + 1/n^R)

Define the zero-mean sufficient statistic of X_1^n for H:

T_n ≜ (1/n) ∑_{i=1}^n X_i − q − 1/(2n^R)

Let Ĥ_ML^n(T_n) denote the ML decoder for H based on T_n with minimum probability of error P_ML^n ≜ P(Ĥ_ML^n(T_n) ≠ H)

Want: Largest R > 0 such that lim_{n→∞} P_ML^n = 0?

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 14 / 39


Page 72: Capacity of Permutation Channels - cs.purdue.edu

Intuition via Central Limit Theorem

For large n, P_{T_n|H}(·|0) and P_{T_n|H}(·|1) are approximately Gaussian distributions

|E[T_n|H = 0] − E[T_n|H = 1]| = 1/n^R

Standard deviations are Θ(1/√n)

[Figure: the two conditional densities P_{T_n|H}(t|0) and P_{T_n|H}(t|1), centered 1/n^R apart, each of width Θ(1/√n)]

Case R < 1/2: Decoding is possible
Case R > 1/2: Decoding is impossible

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 15 / 39
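The scaling comparison above can be checked numerically: the mean gap 1/n^R shrinks slower or faster than the Θ(1/√n) standard deviation depending on whether R is below or above 1/2. A small sketch (q = 1/3 is an arbitrary illustrative choice):

```python
import math

def separation_ratio(n, R, q=1/3):
    """Ratio of the mean gap 1/n^R between the two hypotheses to the
    standard deviation of an i.i.d. Ber(q) sample average, which is
    Theta(1/sqrt(n))."""
    gap = 1.0 / n**R                  # |E[T_n|H=1] - E[T_n|H=0]|
    std = math.sqrt(q * (1 - q) / n)  # std of the sample mean
    return gap / std

# For R < 1/2 the ratio grows with n (decoding possible);
# for R > 1/2 it shrinks to zero (decoding impossible).
```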


Page 79: Capacity of Permutation Channels - cs.purdue.edu

Second Moment Method for TV Distance

Lemma (2nd Moment Method [EKPS00])

‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV ≥ (E[T_n|H = 1] − E[T_n|H = 0])^2 / (4 VAR(T_n))

where ‖P − Q‖_TV = (1/2) ‖P − Q‖_1 denotes the total variation (TV) distance between the distributions P and Q.

Proof: Let T_n^+ ∼ P_{T_n|H=1} and T_n^− ∼ P_{T_n|H=0}. By the Cauchy-Schwarz inequality,

(E[T_n^+] − E[T_n^−])^2 = (∑_t t √P_{T_n}(t) · (P_{T_n|H}(t|1) − P_{T_n|H}(t|0)) / √P_{T_n}(t))^2
                        ≤ (∑_t t^2 P_{T_n}(t)) (∑_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))^2 / P_{T_n}(t))

Recalling that T_n is zero-mean, ∑_t t^2 P_{T_n}(t) = VAR(T_n), so (as in the Hammersley-Chapman-Robbins bound)

(E[T_n^+] − E[T_n^−])^2 ≤ 4 VAR(T_n) ((1/4) ∑_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))^2 / P_{T_n}(t))

where the bracketed factor is the Vincze-Le Cam distance, which is upper bounded by the TV distance:

(E[T_n^+] − E[T_n^−])^2 ≤ 4 VAR(T_n) ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 16 / 39
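The lemma is easy to sanity-check on small finite distributions under a uniform prior over H; a sketch:

```python
def tv_lower_bound_holds(p0, p1, support):
    """Check the 2nd-moment-method bound
    ||P1 - P0||_TV >= (E1[T] - E0[T])^2 / (4 Var(T))
    for a uniform prior over H in {0, 1}, where T has conditional
    laws p0 and p1 on the given support."""
    mix = [(a + b) / 2 for a, b in zip(p0, p1)]   # unconditional law of T
    mean0 = sum(t * a for t, a in zip(support, p0))
    mean1 = sum(t * b for t, b in zip(support, p1))
    mean = sum(t * m for t, m in zip(support, mix))
    var = sum((t - mean) ** 2 * m for t, m in zip(support, mix))
    tv = 0.5 * sum(abs(a - b) for a, b in zip(p0, p1))
    return tv >= (mean1 - mean0) ** 2 / (4 * var)
```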


Page 86: Capacity of Permutation Channels - cs.purdue.edu

BSC Achievability Proof

Proposition (BSC Achievability)

For any 0 < R < 1/2, consider the binary hypothesis testing problem with H ∼ Ber(1/2), and X_1^n i.i.d. ∼ Ber(q + h/n^R) given H = h ∈ {0, 1}. Then, lim_{n→∞} P_ML^n = 0. This implies that:

Cperm(BSC(p)) ≥ 1/2.

Proof: For any 0 < R < 1/2, start with Le Cam's relation, apply the second moment method lemma, and simplify after explicit computation:

P_ML^n = (1/2) (1 − ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV)
       ≤ (1/2) (1 − (E[T_n|H = 1] − E[T_n|H = 0])^2 / (4 VAR(T_n)))
       ≤ 3 / (2 n^{1−2R}) → 0 as n → ∞

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 17 / 39


Page 92: Capacity of Permutation Channels - cs.purdue.edu

Outline

1 Introduction

2 Achievability and Converse for the BSC
    Encoder and Decoder
    Testing between Converging Hypotheses
    Second Moment Method for TV Distance
    Fano's Inequality and CLT Approximation

3 General Achievability Bound

4 General Converse Bounds

5 Conclusion

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 18 / 39

Page 93: Capacity of Permutation Channels - cs.purdue.edu

Recall: Two Information Inequalities

Consider discrete random variables X, Y, Z that form a Markov chain X → Y → Z.

Lemma (Data Processing Inequality [CT06])

I(X; Z) ≤ I(X; Y)

with equality if and only if Z is a sufficient statistic of Y for X, i.e., X → Z → Y also forms a Markov chain.

Lemma (Fano's Inequality [CT06])

If X takes values in the finite alphabet X, then

H(X|Z) ≤ 1 + P(X ≠ Z) log(|X|)

where we perceive Z as an estimator for X based on Y.

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 19 / 39


Page 96: Capacity of Permutation Channels - cs.purdue.edu

BSC Converse Proof: Fano's Inequality Argument

Consider the Markov chain M → X_1^n → Z_1^n → Y_1^n → S_n ≜ ∑_{i=1}^n Y_i → M̂, and a sequence of encoder-decoder pairs {(f_n, g_n)}_{n ∈ N} such that |M| = n^R and lim_{n→∞} P_error^n = 0

Standard argument [CT06], using that M is uniform, Fano's inequality, the data processing inequality, sufficiency of S_n, and the data processing inequality again:

R log(n) = H(M) = H(M|M̂) + I(M; M̂)
         ≤ 1 + P_error^n R log(n) + I(M; Y_1^n)
         = 1 + P_error^n R log(n) + I(M; S_n)
         ≤ 1 + P_error^n R log(n) + I(X_1^n; S_n)

Divide by log(n) and let n → ∞:

R ≤ lim_{n→∞} I(X_1^n; S_n) / log(n)

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 20 / 39


Page 103: Capacity of Permutation Channels - cs.purdue.edu

BSC Converse Proof: CLT Approximation

Upper bound on I(X_1^n; S_n): since S_n ∈ {0, ..., n}, and given X_1^n = x_1^n with ∑_{i=1}^n x_i = k we have S_n = bin(k, 1 − p) + bin(n − k, p),

I(X_1^n; S_n) = H(S_n) − H(S_n|X_1^n)
             ≤ log(n + 1) − ∑_{x_1^n ∈ {0,1}^n} P_{X_1^n}(x_1^n) H(bin(k, 1 − p) + bin(n − k, p))
             ≤ log(n + 1) − ∑_{x_1^n ∈ {0,1}^n} P_{X_1^n}(x_1^n) H(bin(n/2, p))
             = log(n + 1) − (1/2) log(πe p(1 − p)n) + O(1/n)

where the second inequality uses max{H(X), H(Y)} ≤ H(X + Y) for X ⊥⊥ Y [CT06, Problem 2.14], and the last step approximates the binomial entropy using the CLT [ALY10].

Hence, we have R ≤ lim_{n→∞} I(X_1^n; S_n)/log(n) = 1/2.

Proposition (BSC Converse)

Cperm(BSC(p)) ≤ 1/2

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 21 / 39
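The CLT entropy approximation in the last step can be checked directly; a sketch (in nats) computing the exact entropy of bin(n/2, p) via log-gamma and comparing it with (1/2) log(πe p(1 − p)n):

```python
import math

def binomial_entropy_nats(m, p):
    """Exact entropy (in nats) of bin(m, p)."""
    h = 0.0
    logp, log1p = math.log(p), math.log(1 - p)
    for k in range(m + 1):
        # log of the binomial pmf at k, via log-gamma for stability
        logpk = (math.lgamma(m + 1) - math.lgamma(k + 1)
                 - math.lgamma(m - k + 1) + k * logp + (m - k) * log1p)
        h -= math.exp(logpk) * logpk
    return h

def clt_approximation(n, p):
    """(1/2) log(pi e p (1-p) n): the CLT estimate of H(bin(n/2, p))."""
    return 0.5 * math.log(math.pi * math.e * p * (1 - p) * n)
```

For moderate n the two values already agree to a few decimal places, consistent with the O(1/n) error term.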


Page 111: Capacity of Permutation Channels - cs.purdue.edu

Information Capacity of the BSC Permutation Channel

Proposition (Permutation Channel Capacity of BSC)

Cperm(BSC(p)) = 1, for p = 0, 1
                1/2, for p ∈ (0, 1/2) ∪ (1/2, 1)
                0, for p = 1/2

[Figure: plot of Cperm(BSC(p)) against p, equal to 1/2 on (0, 1/2) ∪ (1/2, 1), with value 1 at p ∈ {0, 1} and 0 at p = 1/2]

Remarks:

Cperm(·) is discontinuous and non-convex

Cperm(·) is generally agnostic to parameters of channel

Computationally tractable coding scheme in achievability proof

Anuran Makur (MIT) Capacity of Permutation Channels 7 October 2020 22 / 39


Page 115: Capacity of Permutation Channels - cs.purdue.edu

Outline

1 Introduction

2 Achievability and Converse for the BSC

3 General Achievability Bound
  Coding Scheme
  Rank Bound

4 General Converse Bounds

5 Conclusion

Recall General Problem

M → ENCODER → X_1^n → CHANNEL → Z_1^n → RANDOM PERMUTATION → Y_1^n → DECODER → M̂

Average probability of error: P^n_error ≜ P(M̂ ≠ M)

"Rate" of coding scheme (f_n, g_n) is R ≜ log(|M|)/log(n)

Rate R ≥ 0 is achievable ⇔ ∃ {(f_n, g_n)}_{n∈N} such that lim_{n→∞} P^n_error = 0

Definition (Permutation Channel Capacity)

Cperm(P_{Z|X}) ≜ sup{R ≥ 0 : R is achievable}

Main Question

What is the permutation channel capacity of a general P_{Z|X}?

Achievability: Coding Scheme

Let r = rank(P_{Z|X}) and k = ⌊√n⌋

Consider X′ ⊆ X with |X′| = r such that {P_{Z|X}(·|x) : x ∈ X′} are linearly independent

Message set:

M ≜ { p = (p(x) : x ∈ X′) ∈ (Z_+)^{X′} : ∑_{x∈X′} p(x) = k }

where |M| = C(k+r−1, r−1) = Θ(n^{(r−1)/2})

Randomized Encoder:

∀p ∈ M, f_n(p) = X_1^n i.i.d. ∼ P_X where P_X(x) = p(x)/k for x ∈ X′, and 0 for x ∈ X \ X′
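The message set is just the set of histograms with r nonnegative parts summing to k, so its size follows from stars and bars. A minimal sketch (the alphabet and parameters are illustrative, not from the paper) that enumerates it and samples the randomized encoder:

```python
import math
import random
from itertools import combinations

def message_set(r, k):
    """All histograms p with r nonnegative parts summing to k, enumerated via
    stars and bars: choose r-1 bar positions among k+r-1 slots."""
    msgs = []
    for bars in combinations(range(k + r - 1), r - 1):
        prev, parts = -1, []
        for b in bars:
            parts.append(b - prev - 1)  # stars strictly between consecutive bars
            prev = b
        parts.append(k + r - 2 - prev)  # stars after the last bar
        msgs.append(tuple(parts))
    return msgs

def encode(p, k, n, alphabet, rng):
    """Randomized encoder f_n: draw X_1^n i.i.d. from P_X(x) = p(x)/k on X'."""
    weights = [px / k for px in p]
    return rng.choices(alphabet, weights=weights, k=n)

r, k = 3, 5
M = message_set(r, k)
assert len(M) == math.comb(k + r - 1, r - 1)  # |M| = C(k+r-1, r-1)

rng = random.Random(0)
codeword = encode(M[0], k, n=10, alphabet=['a', 'b', 'c'], rng=rng)
```

For r = 3 and k = 5 this gives |M| = C(7, 2) = 21 messages; with k = ⌊√n⌋, the count grows as Θ(n^{(r−1)/2}), which is exactly where the rate (r−1)/2 comes from.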


Achievability: Coding Scheme

Let the stochastic matrix P_{Z|X} ∈ R^{r×|Y|} have rows {P_{Z|X}(·|x) : x ∈ X′}

Let P†_{Z|X} denote its Moore-Penrose pseudoinverse

(Sub-optimal) Thresholding Decoder: For any y_1^n ∈ Y^n,

Step 1: Construct its type/empirical distribution/histogram

∀y ∈ Y, P̂_{y_1^n}(y) = (1/n) ∑_{i=1}^n 1{y_i = y}

Step 2: Generate the estimate p̂ ∈ (Z_+)^{X′} with components

∀x ∈ X′, p̂(x) = argmin_{j∈{0,…,k}} | ∑_{y∈Y} P̂_{y_1^n}(y) [P†_{Z|X}]_{y,x} − j/k |

Step 3: Output the decoded message

g_n(y_1^n) = p̂, if p̂ ∈ M;  error, otherwise
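For the BSC the whole scheme fits in a few lines, since r = |Y| = 2 and the pseudoinverse of P_{Z|X} = [[1−p, p], [p, 1−p]] is its ordinary inverse. A toy end-to-end run (a sketch, not the paper's exact parameters: k is taken well below ⌊√n⌋ = 141 so that a single seeded run decodes with a comfortable margin):

```python
import random

p_cross, n, k = 0.1, 20000, 20           # BSC(0.1); k chosen small for margin
message = (13, 7)                         # histogram (p(0), p(1)), sums to k
rng = random.Random(7)

# Randomized encoder: X_1^n i.i.d. with P_X(1) = p(1)/k
x = [1 if rng.random() < message[1] / k else 0 for _ in range(n)]
# Memoryless BSC noise, then a random permutation (which leaves the type unchanged)
z = [xi ^ (rng.random() < p_cross) for xi in x]
rng.shuffle(z)

# Step 1: empirical type of the received sequence (fraction of ones)
type1 = sum(z) / n
# Step 2: invert the channel: P_Z(1) = p + P_X(1)*(1-2p), so the estimate of
# P_X(1) is (type1 - p)/(1 - 2p); then round to the nearest grid point j/k
est = (type1 - p_cross) / (1 - 2 * p_cross)
j = min(range(k + 1), key=lambda j: abs(est - j / k))
# Step 3: output the decoded histogram (always a valid message here)
decoded = (k - j, j)
```

The estimate's standard deviation is Θ(1/√n) while the decoding grid has spacing 1/k, so for this choice of k the rounding step succeeds with overwhelming probability.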


Achievability: Rank Bound

Theorem (Rank Bound)

For any channel P_{Z|X}:  Cperm(P_{Z|X}) ≥ (rank(P_{Z|X}) − 1)/2.

Remarks about Coding Scheme:

Showing lim_{n→∞} P^n_error = 0 proves the theorem.

Intuition: Conditioned on M = p, the empirical distribution P̂_{Y_1^n} ≈ P_Z with high probability as n → ∞. Hence, ∑_{y∈Y} P̂_{Y_1^n}(y) [P†_{Z|X}]_{y,x} ≈ P_X(x) for all x ∈ X′ with high probability.

Computational complexity: Decoder has O(n) running time.

Probabilistic method: Good deterministic codes exist.

Expurgation: Achievability bound holds under the maximal probability of error criterion.


Outline

1 Introduction

2 Achievability and Converse for the BSC

3 General Achievability Bound

4 General Converse Bounds
  Output Alphabet Bound
  Effective Input Alphabet Bound
  Degradation by Symmetric Channels

5 Conclusion

Converse: Output Alphabet Bound

Theorem (Output Alphabet Bound)

For any entry-wise strictly positive channel P_{Z|X} > 0:  Cperm(P_{Z|X}) ≤ (|Y| − 1)/2.

Remarks:

Proof hinges on Fano's inequality and a CLT approximation of the binomial entropy.

What if |X| is much smaller than |Y|? Want: a converse bound in terms of input alphabet size.


Converse: Effective Input Alphabet Bound

Theorem (Effective Input Alphabet Bound)

For any entry-wise strictly positive channel P_{Z|X} > 0:

Cperm(P_{Z|X}) ≤ (ext(P_{Z|X}) − 1)/2

where ext(P_{Z|X}) denotes the number of extreme points of conv{P_{Z|X}(·|x) : x ∈ X}.

Remarks:

Effective input alphabet size: rank(P_{Z|X}) ≤ ext(P_{Z|X}) ≤ |X|.

For any channel P_{Z|X} > 0, Cperm(P_{Z|X}) ≤ (min{ext(P_{Z|X}), |Y|} − 1)/2.

For any general channel P_{Z|X}, Cperm(P_{Z|X}) ≤ min{ext(P_{Z|X}), |Y|} − 1.

How do we prove the above theorem?
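Computing ext(P_{Z|X}) means counting extreme points of the convex hull of the channel's rows. When |Y| = 3 the rows live on the 2-simplex, so one can drop the redundant last coordinate and run a planar convex hull (Andrew's monotone chain). A sketch on a hypothetical 4-input channel whose last row is a mixture of the others, so ext = 3 < |X| = 4:

```python
def hull_vertices(points):
    """Andrew's monotone chain: vertices of the convex hull of 2-D points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    def half(seq):
        chain = []
        for p in seq:
            # pop points that do not make a strict left turn
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(pts[::-1])   # lower hull + upper hull

# Rows of a hypothetical channel with |Y| = 3; the last row is the average of
# the first three, hence not an extreme point of their convex hull.
rows = [(0.8, 0.1, 0.1), (0.1, 0.8, 0.1), (0.1, 0.1, 0.8), (1/3, 1/3, 1/3)]
proj = [(r[0], r[1]) for r in rows]      # last coordinate is 1 - r[0] - r[1]
ext = len(hull_vertices(proj))
```

Here ext = 3 and the first three rows are linearly independent, so rank = 3 as well: the upper bound (min{ext, |Y|} − 1)/2 = 1 matches the rank lower bound, pinning Cperm = 1 for this strictly positive channel.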


Brief Digression: Degradation

Definition (Degradation/Blackwell Order [Bla51], [She51], [Ste51], [Cov72], [Ber73])

Given channels P_{Z1|X} and P_{Z2|X} with common input alphabet X, P_{Z2|X} is a degraded version of P_{Z1|X} if P_{Z2|X} = P_{Z1|X} P_{Z2|Z1} for some channel P_{Z2|Z1}.

Theorem (Blackwell-Sherman-Stein [Bla51], [She51], [Ste51])

The observation model P_{Z2|X} is a degraded version of P_{Z1|X} if and only if, for every prior distribution P_X and every loss function L : X × X → R, the Bayes risks satisfy:

min_{f(·)} E[L(X, f(Z1))] ≤ min_{g(·)} E[L(X, g(Z2))]

where the minima are over all randomized estimators of X.


Brief Digression: Symmetric Channels

Definition (q-ary Symmetric Channel)

A q-ary symmetric channel, denoted q-SC(δ), with total crossover probability δ ∈ [0, 1] and alphabet X where |X| = q, is given by the doubly stochastic matrix:

W_δ ≜
⎡ 1−δ       δ/(q−1)  ⋯  δ/(q−1) ⎤
⎢ δ/(q−1)   1−δ      ⋯  δ/(q−1) ⎥
⎢ ⋮         ⋮        ⋱  ⋮       ⎥
⎣ δ/(q−1)   δ/(q−1)  ⋯  1−δ     ⎦

Proposition (Degradation by Symmetric Channels)

Given a channel P_{Z|X} with ν = min_{x∈X, y∈Y} P_{Z|X}(y|x), if 0 ≤ δ ≤ ν/(1 − ν + ν/(q−1)), then P_{Z|X} is a degraded version of q-SC(δ).


Brief Digression: Symmetric Channels

Proposition (Degradation by Symmetric Channels)

Given a channel P_{Z|X} with ν = min_{x∈X, y∈Y} P_{Z|X}(y|x), if 0 ≤ δ ≤ ν/(1 − ν + ν/(q−1)), then P_{Z|X} is a degraded version of q-SC(δ).

Remarks:

The proposition follows from computing the extremal δ such that W_δ^{−1} P_{Z|X} is row stochastic.

The bound on δ can be improved when more is known about P_{Z|X}:
  Markov chain [MP18]: δ ≤ ν/(1 − (q−1)ν + ν/(q−1)).
  Additive noise channel on an Abelian group X [MP18]: δ ≤ (q−1)ν.
  Alternative bounds for Markov chains [MOS13].

Many applications in information theory, statistics, and probability [MP18], [MOS13].
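The row-stochasticity computation behind the proposition is easy to replicate: W_δ = cI + dJ with d = δ/(q−1), c = 1 − δ − d, and since its rows sum to 1 (so c + qd = 1), it has the closed-form inverse W_δ^{−1} = (1/c)(I − dJ). A sketch on a hypothetical strictly positive 3×3 channel at the extremal δ:

```python
# Verify that P_{Z|X} = W_delta * M for a row-stochastic M at the extremal delta.
P = [[0.7, 0.2, 0.1],
     [0.1, 0.6, 0.3],
     [0.2, 0.2, 0.6]]                    # hypothetical strictly positive channel
q = 3
nu = min(min(row) for row in P)          # nu = 0.1
delta = nu / (1 - nu + nu / (q - 1))     # extremal delta from the proposition

d = delta / (q - 1)                      # off-diagonal entry of W_delta
c = 1 - delta - d                        # W_delta = c*I + d*J (J = all-ones)
# Since c + q*d = 1, W_delta^{-1} = (1/c)*(I - d*J), so
# M = W_delta^{-1} P has entries (P[i][j] - d * colsum_j) / c.
col_sums = [sum(P[i][j] for i in range(q)) for j in range(q)]
M = [[(P[i][j] - d * col_sums[j]) / c for j in range(q)] for i in range(q)]

assert all(m >= -1e-12 for row in M for m in row)    # entry-wise nonnegative
assert all(abs(sum(row) - 1) < 1e-9 for row in M)    # rows sum to one
# Reconstruct P = W_delta * M to confirm the degradation factorization
W = [[1 - delta if i == j else d for j in range(q)] for i in range(q)]
recon = [[sum(W[i][l] * M[l][j] for l in range(q)) for j in range(q)]
         for i in range(q)]
assert all(abs(recon[i][j] - P[i][j]) < 1e-9 for i in range(q) for j in range(q))
```

At any larger δ, some entry of W_δ^{−1} P would go negative, which is exactly how the extremal δ in the proposition is pinned down.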


Proof Idea: Degradation by Symmetric Channels

Theorem (Effective Input Alphabet Bound)

For any entry-wise strictly positive channel P_{Z|X} > 0:  Cperm(P_{Z|X}) ≤ (ext(P_{Z|X}) − 1)/2.

Proof Sketch:

Degradation by symmetric channels + tensorization of degradation + data processing

⇒ I(X_1^n; Y_1^n) ≤ I(X_1^n; Ỹ_1^n)

where Y_1^n and Ỹ_1^n are the outputs of permutation channels with P_{Z|X} and q-SC(δ), respectively.

Convexity of KL divergence ⇒ reduce |X| to ext(P_{Z|X}).

Fano argument of the output alphabet bound ⇒ effective input alphabet bound.


Outline

1 Introduction

2 Achievability and Converse for the BSC

3 General Achievability Bound

4 General Converse Bounds

5 Conclusion
  Strictly Positive and "Full Rank" Channels

Strictly Positive and "Full Rank" Channels

Achievability and converse bounds yield:

Theorem (Strictly Positive and "Full Rank" Channels)

For any entry-wise strictly positive channel P_{Z|X} > 0 that is "full rank" in the sense that r ≜ rank(P_{Z|X}) = min{ext(P_{Z|X}), |Y|}:

Cperm(P_{Z|X}) = (r − 1)/2.

Recall Example: Cperm of a non-trivial binary symmetric channel is 1/2.
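The BSC instance of this theorem can be checked mechanically: the capacity formula reduces to computing a matrix rank. A minimal sketch with a pure-Python rank routine (Gaussian elimination, fine for small stochastic matrices):

```python
def matrix_rank(A, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting (small matrices)."""
    A = [row[:] for row in A]
    rows, cols = len(A), len(A[0])
    rank = r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(A[i][c]), default=None)
        if pivot is None or abs(A[pivot][c]) < tol:
            continue                      # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
        rank += 1
        if r == rows:
            break
    return rank

def cperm_full_rank(P):
    """Cperm = (rank - 1)/2, valid when P > 0 and rank(P) = min{ext(P), |Y|}."""
    return (matrix_rank(P) - 1) / 2

cap = cperm_full_rank([[0.9, 0.1], [0.1, 0.9]])   # BSC(0.1): rank 2
```

For BSC(0.1) the rank is 2 and the formula gives 1/2, matching the example above; for BSC(1/2) the two rows coincide, the rank drops to 1 (and ext = 1 = min{ext, |Y|}, so the theorem still applies), and the formula correctly degenerates to 0.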


Conclusion

Main Result: For any entry-wise strictly positive channel P_{Z|X} > 0:

(rank(P_{Z|X}) − 1)/2 ≤ Cperm(P_{Z|X}) ≤ (min{ext(P_{Z|X}), |Y|} − 1)/2.

Future Directions:

Characterize Cperm of all (entry-wise strictly positive) channels.

Perform error exponent analysis (i.e., tight bounds on P^n_error).

Prove strong converse results (i.e., a phase transition for P^n_error).

Perform finite blocklength analysis (i.e., exact asymptotics for the maximum achievable |M|).

Analyze permutation channels with more complex probability models in the random permutation block.


References

This talk was based on:

A. Makur, "Information capacity of BSC and BEC permutation channels," in Proceedings of the 56th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, October 2-5, 2018, pp. 1112-1119.

A. Makur, "Bounds on permutation channel capacity," in Proceedings of the IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, June 21-26, 2020.

A. Makur, "Coding theorems for noisy permutation channels," accepted to IEEE Transactions on Information Theory, July 2020.


Thank You!
