
Information Capacity of the BSC Permutation Channel

Anuran Makur

EECS Department, Massachusetts Institute of Technology

Allerton Conference, 5 October 2018

Outline

1 Introduction
  Motivation: Coding for Communication Networks
  The Permutation Channel Model
  Capacity of the BSC Permutation Channel

2 Achievability

3 Converse

4 Conclusion



Motivation: Point-to-point Communication in Networks

[Diagram: a SENDER and a RECEIVER communicating through a NETWORK.]

Model the communication network as a channel:

Alphabet symbols = all possible L-bit packets ⇒ 2^L input symbols
Multipath routed network (or evolving network topology) ⇒ packets received with transpositions
Packets are impaired (e.g. deletions, substitutions) ⇒ model using channel probabilities


Example: Coding for Random Deletion Network

Consider a communication network where packets can be dropped:

[Diagram: SENDER → NETWORK → RECEIVER, abstracted as a RANDOM DELETION (equivalently, ERASURE) channel followed by a RANDOM PERMUTATION; numbered packets 1, 2, 3 are sent and a shuffled subset arrives.]

Abstraction:

n-length codeword = sequence of n packets
Random deletion channel: delete each symbol/packet of the codeword independently with probability p ∈ (0, 1)
Random permutation block: randomly permute the packets of the codeword
Coding: add sequence numbers and use standard coding techniques; with sequence numbers, deletions become erasures (packet size = L + log(n) bits, alphabet size = n · 2^L)
More refined coding techniques simulate sequence numbers, e.g. [Mitzenmacher 2006], [Metzner 2009]

How do you code over such channels without increasing the alphabet size?

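To make the sequence-number workaround concrete, here is a minimal Python sketch (not from the talk; all names and the toy payloads are illustrative) of coding over a random-deletion-plus-permutation channel: prefixing each packet with its index lets the receiver re-sort the survivors and treat missing indices as erasures.

```python
import random

def deletion_permutation_channel(packets, p):
    """Drop each packet independently with probability p, then shuffle."""
    survivors = [pkt for pkt in packets if random.random() >= p]
    random.shuffle(survivors)
    return survivors

def send_with_sequence_numbers(payloads, p):
    """Prefix each L-bit payload with its index (costing ~log2(n) extra bits),
    so the permutation and the deletions become recoverable erasures."""
    packets = list(enumerate(payloads))        # (sequence number, payload)
    received = deletion_permutation_channel(packets, p)
    output = [None] * len(payloads)            # None marks an erasure
    for idx, payload in received:
        output[idx] = payload                  # undo the permutation
    return output

random.seed(0)
payloads = [format(i * 37 % 256, "08b") for i in range(10)]  # ten 8-bit packets
print(send_with_sequence_numbers(payloads, p=0.3))
```

After re-sorting, any standard erasure code applies, but the alphabet has grown from 2^L to n · 2^L; avoiding that growth is exactly the point of the permutation channel formulation.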

The Permutation Channel Model

[Diagram: M → ENCODER → X_1^n → CHANNEL → Z_1^n → RANDOM PERMUTATION → Y_1^n → DECODER → M̂]

Sender sends message M ∼ Uniform(M)

Possibly randomized encoder f_n : M → X^n produces codeword X_1^n = (X_1, …, X_n) = f_n(M) (with block-length n)

Discrete memoryless channel P_{Z|X} with input and output alphabets X and Y produces Z_1^n:

P_{Z_1^n|X_1^n}(z_1^n | x_1^n) = ∏_{i=1}^{n} P_{Z|X}(z_i | x_i)

Random permutation generates Y_1^n from Z_1^n

Possibly randomized decoder g_n : Y^n → M produces estimate M̂ = g_n(Y_1^n) at the receiver

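For concreteness, here is a small Python sketch (illustrative, not from the slides) of the full pipeline for the BSC case: a codeword passes through a memoryless BSC(p) and is then uniformly permuted. Only the multiset of output bits survives the permutation, so the number of ones is the natural statistic for the decoder.

```python
import numpy as np

def bsc_permutation_channel(x, p, rng):
    """Memoryless BSC(p) followed by a uniformly random permutation."""
    flips = (rng.random(x.shape) < p).astype(x.dtype)
    z = x ^ flips                  # flip each bit independently w.p. p
    return rng.permutation(z)      # uniformly random permutation of Z_1^n

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=20)    # some codeword X_1^n
y = bsc_permutation_channel(x, p=0.1, rng=rng)
print("ones in X_1^n:", x.sum(), "  ones in Y_1^n:", y.sum())
```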

Coding for the Permutation Channel

[Diagram: M → ENCODER → CHANNEL → RANDOM PERMUTATION → DECODER → M̂]

General Principle: "Encode the information in an object that is invariant under the [permutation] transformation." [Kovacevic-Vukobratovic 2013]

Multiset codes are studied in [Kovacevic-Vukobratovic 2013], [Kovacevic-Vukobratovic 2015], and [Kovacevic-Tan 2018]

What about the information-theoretic aspects of this model?


Information Capacity of the Permutation Channel

[Diagram: M → ENCODER → CHANNEL → RANDOM PERMUTATION → DECODER → M̂]

Average probability of error: P_error^n ≜ P(M̂ ≠ M)

"Rate" of an encoder-decoder pair (f_n, g_n): R ≜ log(|M|) / log(n), i.e. |M| = n^R (normalizing by log(n) is natural because the number of empirical distributions of Y_1^n is only poly(n))

Rate R ≥ 0 is achievable ⇔ ∃ {(f_n, g_n)}_{n∈N} such that lim_{n→∞} P_error^n = 0

Definition (Permutation Channel Capacity)

C_perm(P_{Z|X}) ≜ sup{R ≥ 0 : R is achievable}


Capacity of the BSC Permutation Channel

[Diagram: M → ENCODER → BSC(p) → RANDOM PERMUTATION → DECODER → M̂]

Channel is the binary symmetric channel, denoted BSC(p):

∀ z, x ∈ {0, 1}: P_{Z|X}(z|x) = 1 − p for z = x, and p for z ≠ x

Alphabets are X = Y = {0, 1}

Assume crossover probability p ∈ (0, 1) and p ≠ 1/2

Main Question: What is the permutation channel capacity of the BSC?


Outline

1 Introduction

2 Achievability
  Encoder and Decoder
  Testing between Converging Hypotheses
  Intuition via Central Limit Theorem
  Second Moment Method for TV Distance

3 Converse

4 Conclusion


Warm-up: Sending Two Messages

[Diagram: M → ENCODER → BSC(p) → RANDOM PERMUTATION → DECODER → M̂]

Fix a message m ∈ {0, 1}, and encode m as f_n(m) = X_1^n i.i.d. ∼ Ber(q_m)

[Figure: q_0 and q_1 marked on the unit interval [0, 1], e.g. at 1/3 and 2/3.]

Memoryless BSC(p) outputs Z_1^n i.i.d. ∼ Ber(p ∗ q_m), where p ∗ q_m ≜ p(1 − q_m) + q_m(1 − p) is the convolution of p and q_m

Random permutation generates Y_1^n i.i.d. ∼ Ber(p ∗ q_m)

Maximum likelihood (ML) decoder: M̂ = 1{(1/n) ∑_{i=1}^{n} Y_i ≥ 1/2}

(1/n) ∑_{i=1}^{n} Y_i → p ∗ q_m in probability as n → ∞ [WLLN]

⇒ lim_{n→∞} P_error^n = 0, since p ∗ q_0 ≠ p ∗ q_1

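A quick Monte Carlo check (illustrative; q_0 = 1/3 and q_1 = 2/3 as in the figure, and the permutation is skipped because the threshold decoder only uses the sum of the Y_i, which is Bin(n, p ∗ q_m)) confirms that the error probability vanishes as n grows.

```python
import numpy as np

def pn_error(n, p, q0=1/3, q1=2/3, trials=20000, seed=1):
    """Monte Carlo estimate of P(M_hat != M) for the two-message scheme."""
    rng = np.random.default_rng(seed)
    m = rng.integers(0, 2, size=trials)              # M ~ Uniform({0, 1})
    qm = np.where(m == 0, q0, q1)
    s = p * (1 - qm) + qm * (1 - p)                  # p * q_m (convolution)
    ones = rng.binomial(n, s)                        # sum of Y_i ~ Bin(n, p*q_m)
    m_hat = (ones / n >= 0.5).astype(int)            # ML threshold decoder
    return np.mean(m_hat != m)

for n in (10, 100, 1000):
    print(n, pn_error(n, p=0.2))   # error decays to 0 since p*q0 != p*q1
```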

Encoder and Decoder

Suppose M = {1, …, n^R} for some R > 0

Randomized encoder: given m ∈ M, f_n(m) = X_1^n i.i.d. ∼ Ber(m/n^R)

[Figure: the n^R encoder parameters m/n^R spread uniformly over the unit interval [0, 1].]

Given m ∈ M, Y_1^n i.i.d. ∼ Ber(p ∗ m/n^R) (as before)

ML decoder: for y_1^n ∈ {0, 1}^n, g_n(y_1^n) = arg max_{m ∈ M} P_{Y_1^n|M}(y_1^n | m)

Challenge: although (1/n) ∑_{i=1}^{n} Y_i → p ∗ (m/n^R) in probability as n → ∞, consecutive messages become indistinguishable, i.e. m/n^R − (m+1)/n^R → 0

Fact: consecutive messages distinguishable ⇒ lim_{n→∞} P_error^n = 0

What is the largest R such that two consecutive messages can be distinguished?

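The following sketch (illustrative; it decodes to the message whose output mean p ∗ (m/n^R) is closest to the empirical mean, a simple proxy for the exact ML rule) shows the scheme working for R below 1/2 and breaking down above it.

```python
import numpy as np

def error_rate(n, R, p, trials=5000, seed=2):
    """Error rate with |M| = n^R messages, encoder X_1^n ~ i.i.d. Ber(m/n^R)."""
    rng = np.random.default_rng(seed)
    n_msgs = max(2, round(n ** R))
    q = np.arange(1, n_msgs + 1) / n_msgs      # encoder parameters m / n^R
    s = p + q * (1 - 2 * p)                    # output means p * (m / n^R)
    m = rng.integers(0, n_msgs, size=trials)   # uniform messages (0-indexed)
    emp = rng.binomial(n, s[m]) / n            # empirical mean of Y_1^n
    m_hat = np.abs(emp[:, None] - s[None, :]).argmin(axis=1)
    return np.mean(m_hat != m)

n = 4000
for R in (0.3, 0.7):                           # below vs above capacity 1/2
    print(R, error_rate(n, R, p=0.1))
```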

Testing between Converging Hypotheses

Binary Hypothesis Testing:

Consider hypothesis H ∼ Ber(1/2), i.e. with uniform prior

For any n ∈ N, q ∈ (0, 1), and R > 0, consider the likelihoods:

Given H = 0: X_1^n i.i.d. ∼ P_{X|H=0} = Ber(q)
Given H = 1: X_1^n i.i.d. ∼ P_{X|H=1} = Ber(q + 1/n^R)

Define the zero-mean sufficient statistic of X_1^n for H:

T_n ≜ (1/n) ∑_{i=1}^{n} X_i − q − 1/(2 n^R)

Let Ĥ_ML^n(T_n) denote the ML decoder for H based on T_n, with minimum probability of error P_ML^n ≜ P(Ĥ_ML^n(T_n) ≠ H)

Want: the largest R > 0 such that lim_{n→∞} P_ML^n = 0


Intuition via Central Limit Theorem

For large n, P_{T_n|H}(·|0) and P_{T_n|H}(·|1) are approximately Gaussian [CLT]

|E[T_n|H = 0] − E[T_n|H = 1]| = 1/n^R

Standard deviations are Θ(1/√n)

[Figure: the two conditional densities P_{T_n|H}(t|0) and P_{T_n|H}(t|1), centered ±1/(2n^R) around 0, each with width Θ(1/√n).]

Case R < 1/2: the separation 1/n^R dwarfs the spread Θ(1/√n) ⇒ decoding is possible
Case R > 1/2: the two densities overlap almost completely ⇒ decoding is impossible

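The picture can be quantified with a back-of-the-envelope computation (illustrative; it uses the exact error Q(Δ/(2σ)) of the ML test between two equal-variance Gaussians, with Δ = 1/n^R and σ = c/√n for an arbitrary constant c):

```python
import math

def gaussian_ml_error(n, R, c=0.5):
    """Error of the ML test between N(0, s^2) and N(1/n^R, s^2), s = c/sqrt(n):
    P_error = Q(delta / (2 s)), where Q is the Gaussian tail function."""
    s = c / math.sqrt(n)
    delta = n ** (-R)
    return 0.5 * math.erfc(delta / (2 * s) / math.sqrt(2))

for n in (10**2, 10**4, 10**6):
    print(n, gaussian_ml_error(n, R=0.4), gaussian_ml_error(n, R=0.6))
# R = 0.4 column -> 0, while the R = 0.6 column -> 1/2 (pure guessing)
```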

Second Moment Method for TV Distance

Lemma (2nd Moment Method [Evans-Kenyon-Peres-Schulman 2000])

‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV ≥ (E[T_n|H = 1] − E[T_n|H = 0])² / (4 VAR(T_n)),

where ‖P − Q‖_TV = (1/2) ‖P − Q‖_{ℓ1} is the total variation (TV) distance between the distributions P and Q.

Proof: Let T_n⁺ ∼ P_{T_n|H=1} and T_n⁻ ∼ P_{T_n|H=0}. Then:

(E[T_n⁺] − E[T_n⁻])² = ( ∑_t t √P_{T_n}(t) · (P_{T_n|H}(t|1) − P_{T_n|H}(t|0)) / √P_{T_n}(t) )²

≤ ( ∑_t t² P_{T_n}(t) ) ( ∑_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))² / P_{T_n}(t) )   [Cauchy-Schwarz inequality]

= 4 VAR(T_n) ( (1/4) ∑_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))² / P_{T_n}(t) )   [T_n is zero-mean; the bracketed factor is the Vincze-Le Cam distance, as in the Hammersley-Chapman-Robbins bound]

≤ 4 VAR(T_n) ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV   ∎

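The lemma is easy to sanity-check numerically (an illustrative check, not from the talk): for the binomial hypothesis test above, both sides can be computed exactly, since T_n is a deterministic shift of the count ∑ X_i.

```python
import numpy as np
from scipy.stats import binom

def check_lemma(n, q, R):
    """Compare the TV distance with the second-moment lower bound."""
    k = np.arange(n + 1)
    p0 = binom.pmf(k, n, q)                    # law of the count under H = 0
    p1 = binom.pmf(k, n, q + n ** (-R))        # law of the count under H = 1
    tv = 0.5 * np.abs(p1 - p0).sum()           # = TV between the laws of T_n
    t = k / n - q - 0.5 * n ** (-R)            # support values of T_n
    mean_gap = (t * (p1 - p0)).sum()           # E[T_n|H=1] - E[T_n|H=0]
    var = (t ** 2 * 0.5 * (p0 + p1)).sum()     # VAR(T_n) under uniform H
    print(f"n={n}:  TV={tv:.4f}  >=  bound={(mean_gap ** 2) / (4 * var):.4f}")

for n in (100, 1000, 10000):
    check_lemma(n, q=0.3, R=0.4)
```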

Achievability Proof

Theorem (Achievability)

For any 0 < R < 1/2, consider the binary hypothesis testing problem with H ∼ Ber(1/2) and X_1^n i.i.d. ∼ Ber(q + h/n^R) given H = h ∈ {0, 1}. Then, lim_{n→∞} P_ML^n = 0. This implies that:

C_perm(BSC(p)) ≥ 1/2.

Proof: Start with Le Cam's relation, apply the second moment method lemma, and simplify by explicit computation: for any 0 < R < 1/2,

P_ML^n = (1/2)(1 − ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV)
       ≤ (1/2)(1 − (E[T_n|H = 1] − E[T_n|H = 0])² / (4 VAR(T_n)))
       ≤ 3 / (2 n^{1−2R}) → 0 as n → ∞   ∎

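Since P_ML^n = (1/2)(1 − TV) can be evaluated exactly for binomials, the n^{−(1−2R)} decay is easy to visualize (illustrative; q and R are arbitrary choices):

```python
import numpy as np
from scipy.stats import binom

def p_ml(n, q, R):
    """Exact minimum error (1/2)(1 - TV) for Ber(q) vs Ber(q + 1/n^R) samples."""
    k = np.arange(n + 1)
    tv = 0.5 * np.abs(binom.pmf(k, n, q + n ** (-R)) - binom.pmf(k, n, q)).sum()
    return 0.5 * (1 - tv)

q, R = 0.3, 0.4
for n in (10**2, 10**3, 10**4, 10**5):
    print(n, f"P_ML = {p_ml(n, q, R):.5f}", f" bound = {1.5 / n**(1 - 2*R):.5f}")
```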

Outline

1 Introduction

2 Achievability

3 Converse
  Fano's Inequality Argument
  CLT Approximation

4 Conclusion


Converse: Fano's Inequality Argument

Consider the Markov chain M → X_1^n → Z_1^n → Y_1^n → S_n ≜ ∑_{i=1}^{n} Y_i, and a sequence of encoder-decoder pairs {(f_n, g_n)}_{n∈N} such that |M| = n^R and lim_{n→∞} P_error^n = 0

Standard argument, cf. [Cover-Thomas 2006]:

R log(n) = H(M)   [M is uniform]
         = H(M|M̂) + I(M; M̂)
         ≤ 1 + P_error^n R log(n) + I(M; Y_1^n)   [Fano's inequality, DPI]
         = 1 + P_error^n R log(n) + I(M; S_n)   [S_n is a sufficient statistic of Y_1^n for M]
         ≤ 1 + P_error^n R log(n) + I(X_1^n; S_n)   [DPI]

Divide by log(n) and let n → ∞:

R ≤ lim_{n→∞} I(X_1^n; S_n) / log(n)

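The limit can be probed numerically (illustrative; since S_n depends on X_1^n only through the weight K = ∑ X_i, we have I(X_1^n; S_n) = I(K; S_n), and choosing K uniform on {0, …, n} here mimics a spread-out encoder; this input choice is our assumption, not the talk's):

```python
import numpy as np
from scipy.stats import binom

def entropy(pmf):
    pmf = pmf[pmf > 0]
    return -(pmf * np.log(pmf)).sum()          # entropy in nats

def mi_over_logn(n, p):
    """I(K; S_n)/log(n) with K uniform on {0,...,n} and
    S_n | K=k  ~  bin(k, 1-p) + bin(n-k, p)."""
    marg, h_cond = np.zeros(n + 1), 0.0
    for k in range(n + 1):
        pmf_k = np.convolve(binom.pmf(np.arange(k + 1), k, 1 - p),
                            binom.pmf(np.arange(n - k + 1), n - k, p))
        marg += pmf_k / (n + 1)
        h_cond += entropy(pmf_k) / (n + 1)
    return (entropy(marg) - h_cond) / np.log(n)

for n in (50, 200, 800):
    print(n, round(mi_over_logn(n, p=0.1), 3))   # slowly approaches 1/2
```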

Converse: CLT Approximation

Upper bound on I(X_1^n; S_n), using that S_n ∈ {0, …, n} and that, given X_1^n = x_1^n with ∑_{i=1}^{n} x_i = k, S_n ∼ bin(k, 1 − p) + bin(n − k, p):

I(X_1^n; S_n) = H(S_n) − H(S_n|X_1^n)
             ≤ log(n + 1) − ∑_{x_1^n ∈ {0,1}^n} P_{X_1^n}(x_1^n) H(bin(k, 1 − p) + bin(n − k, p))
             ≤ log(n + 1) − ∑_{x_1^n ∈ {0,1}^n} P_{X_1^n}(x_1^n) H(bin(n/2, p))   [Problem 2.14 in [Cover-Thomas 2006]]
             = log(n + 1) − (1/2) log(πe p(1 − p) n) + O(1/n)   [binomial entropy via CLT, cf. [Adell-Lekuona-Yu 2010]]

Hence, we have:

R ≤ lim_{n→∞} I(X_1^n; S_n) / log(n) = 1/2

Theorem (Converse)

C_perm(BSC(p)) ≤ 1/2

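The binomial entropy approximation in the last step is also easy to verify (illustrative check; entropies in nats): H(bin(n/2, p)) should match (1/2) log(πe p(1 − p) n) up to O(1/n).

```python
import numpy as np
from scipy.stats import binom

def binom_entropy(m, p):
    """Exact entropy (in nats) of bin(m, p)."""
    pmf = binom.pmf(np.arange(m + 1), m, p)
    pmf = pmf[pmf > 0]
    return -(pmf * np.log(pmf)).sum()

p = 0.1
for n in (100, 1000, 10000):
    exact = binom_entropy(n // 2, p)                        # H(bin(n/2, p))
    approx = 0.5 * np.log(np.pi * np.e * p * (1 - p) * n)   # CLT approximation
    print(n, round(exact, 4), round(approx, 4))
```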

Outline

1 Introduction

2 Achievability

3 Converse

4 Conclusion


Conclusion

Theorem (Permutation Channel Capacity of the BSC)

C_perm(BSC(p)) = 1, for p = 0, 1
C_perm(BSC(p)) = 1/2, for p ∈ (0, 1/2) ∪ (1/2, 1)
C_perm(BSC(p)) = 0, for p = 1/2

[Figure: plot of C_perm(BSC(p)) versus p, equal to 1/2 on (0, 1/2) ∪ (1/2, 1), jumping to 1 at p = 0, 1 and dropping to 0 at p = 1/2.]

Remarks:

C_perm(·) is discontinuous and non-convex
C_perm(·) is generally agnostic to the parameters of the channel
The achievability proof uses a computationally tractable coding scheme
The proof technique yields more general results


Thank You!
