Least Squares Superposition Codes of Moderate Dictionary Size,
Reliable at Rates up to Capacity

Antony Joseph
Department of Statistics
Yale University

Joint work with Andrew Barron

June 13-18, 2010
2010 IEEE International Symposium on Information Theory
Barron, Joseph (Yale) Least Squares, Superposition codes 1 / 22
Outline

1 The Gaussian channel
2 Sparse Superposition Codes
3 Partitioned Superposition Codes
  - Encoding
  - Performance of Least Squares Decoding
4 Analysis of Reliability at Rates up to Capacity
Gaussian Channel (Power P, Noise variance σ²)

Message u (length K bits) → Encoder → Codeword x (length n) → Channel, which adds noise ε ∼ Normal(0, σ²Iₙ) → y → Decoder → û

Characteristics
- Power constraint: Ave ‖x‖² ≤ nP
- Signal-to-noise ratio: v = P/σ²
- Capacity: C = (1/2) log(1 + P/σ²)

Interested in
- Rate: R = K/n
- Prob{û ≠ u}
- Low fraction of mistakes
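As a quick numerical companion to the definitions above, here is a sketch (not from the talk) computing v, C, and the blocklength implied by a rate below capacity. P/σ² = 15 and R = 0.7C match the talk's later numerical example; K = 1000 is an arbitrary illustrative message length, and capacity is taken in bits per channel use.

```python
import math

def gaussian_capacity(P, sigma2):
    """C = (1/2) log2(1 + P/sigma^2), in bits per channel use."""
    return 0.5 * math.log2(1 + P / sigma2)

P, sigma2 = 15.0, 1.0
v = P / sigma2                     # signal-to-noise ratio
C = gaussian_capacity(P, sigma2)   # = 2.0 bits/use when v = 15

K = 1000                           # message length in bits (illustrative)
R = 0.7 * C                        # a rate below capacity
n = math.ceil(K / R)               # blocklength needed for K bits at rate R
print(f"v = {v}, C = {C} bits/use, R = {R}, n = {n}")
```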
Gaussian Channel
Challenges in code construction
- Achieve rate arbitrarily close to capacity
- Good error exponents
- Manageable codebook
- Fast encoding
- Fast decoding
Sparse Superposition Codes

Model: Y = Xβ + ε

Message u (length K) → β → Encoder → Codeword Xβ (length n) → Channel, which adds noise ε ∼ N(0, σ²Iₙ) → Y → Decoder → û

Code design
- n × N design matrix X
- Received string: Y = Xβ + ε
- Coefficient vector β is sparse: few non-zero values
- Not a linear code in the algebraic coding sense
Sparse Superposition Codes

β = (…, 1, …, 1, …, 1, …)

Code design
- n × N matrix X with entries independent Normal(0, P/L)
- β has exactly L non-zero elements, with L ≪ N, all equal to 1
- Average of ‖Xβ‖² ≤ nP
- Codeword is the sum of the selected columns
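The code design above is easy to simulate. The sketch below (illustrative toy sizes, not from the talk) draws X with independent Normal(0, P/L) entries, forms codewords as sums of selected columns, and checks by Monte Carlo that the average of ‖Xβ‖² is close to nP.

```python
import math
import random

def make_codebook(n, L, B, P, rng):
    """n x (L*B) dictionary with i.i.d. Normal(0, P/L) entries."""
    sd = math.sqrt(P / L)
    return [[rng.gauss(0.0, sd) for _ in range(L * B)] for _ in range(n)]

def codeword(X, sections, L, B):
    """Sum of the selected column from each of the L sections."""
    return [sum(row[s * B + sections[s]] for s in range(L)) for row in X]

rng = random.Random(0)
n, L, B, P = 64, 8, 16, 15.0          # toy sizes (illustrative only)
X = make_codebook(n, L, B, P, rng)

# Each codeword coordinate is a sum of L independent N(0, P/L) entries,
# so E||Xbeta||^2 = nP; check with a Monte Carlo average over selections.
trials = 200
avg = sum(
    sum(c * c for c in codeword(X, [rng.randrange(B) for _ in range(L)], L, B))
    for _ in range(trials)
) / trials
print(f"average ||Xbeta||^2 = {avg:.1f}, target nP = {n * P}")
```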
Partitioned Superposition Codes

X: n × N matrix, columns divided into L sections of B columns each (Section 1, Section 2, …, Section L)

β = (…, 1, … | …, 1, … | … | …, 1, …), one 1 per section; within each section the columns are indexed 0, 1, …, B−1

Code design
- n × N matrix X with entries independent Normal(0, P/L)
- Columns of X divided into L sections of size B, so N = LB
- β has exactly one non-zero element in each section

Relationship between L, B, n and R
- Number of codewords: B^L
- Length of message string: K = L log₂ B bits
- Blocklength: n = K/R = (1/R) L log₂ B
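The parameter relationships can be tabulated directly; the sizes below (L = 100, B = 8192, R = 0.7C with C = 2) are those of the talk's later numerical example.

```python
import math

def code_parameters(L, B, R):
    """Parameter bookkeeping for a partitioned superposition code."""
    N = L * B                     # dictionary size (columns of X)
    K = L * math.log2(B)          # message length in bits
    n = K / R                     # blocklength at rate R (bits per channel use)
    codewords = B ** L            # one column choice per section
    return N, K, n, codewords

N, K, n, M = code_parameters(L=100, B=8192, R=0.7 * 2.0)
print(N, K, round(n))             # dictionary size, message bits, blocklength
```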
Encoding

Example message u: 0 1 1 1 0 0 0 0 1 1 0 1
Substrings: 0111 | 0000 | 1101, i.e. indices 7, 0, 13

Non-zero elements of β:
- 7th element from Section 1
- 0th element from Section 2
- 13th element from Section 3

Map from message u to β
- Assume the length of u is K = L log₂ B. For example, take L = 3 and log₂ B = 4.
- Split u into L substrings of log₂ B bits each
- The substrings give the addresses of the chosen columns
- For the map u → β, choose the non-zero element of β in each section as given by these indices
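The u → β map can be sketched in a few lines, using the slide's own example (L = 3, log₂ B = 4, u = 0111 0000 1101):

```python
def message_to_sections(bits, L, logB):
    """Split a K = L*logB bit string into L section indices (the map u -> beta)."""
    assert len(bits) == L * logB
    return [int(bits[s * logB:(s + 1) * logB], 2) for s in range(L)]

def sections_to_beta(sections, L, B):
    """Indicator vector beta of length N = L*B with a single 1 per section."""
    beta = [0] * (L * B)
    for s, j in enumerate(sections):
        beta[s * B + j] = 1
    return beta

u = "011100001101"                       # the slide's example message
sections = message_to_sections(u, L=3, logB=4)
print(sections)                          # [7, 0, 13]
beta = sections_to_beta(sections, L=3, B=16)
print(sum(beta))                         # 3 (one non-zero entry per section)
```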
Decoding, Least Squares

Least squares decoder
- Find the β̂ that minimizes ‖Y − Xβ‖² over β’s of the assumed form

Reliability
- Want the error event β̂ ≠ β∗ to have small probability when β∗ is sent (small probability of block error)
- Less stringent: want β̂ to differ from β∗ in at most an α fraction of sections (small probability that the section error rate exceeds α)
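For tiny parameters, the least squares decoder can be run by exhaustive search over all B^L section choices (exponential in L, so purely illustrative; the toy sizes and noise level below are not from the talk).

```python
import itertools
import math
import random

def ls_decode(X, Y, L, B):
    """Exhaustive least-squares decoder: search all B^L section choices for
    the one minimizing ||Y - X beta||^2. Exponential in L; illustration only."""
    n = len(Y)
    best, best_cost = None, math.inf
    for choice in itertools.product(range(B), repeat=L):
        cw = [sum(X[i][s * B + choice[s]] for s in range(L)) for i in range(n)]
        cost = sum((Y[i] - cw[i]) ** 2 for i in range(n))
        if cost < best_cost:
            best, best_cost = choice, cost
    return best

rng = random.Random(1)
n, L, B, P, sigma = 40, 3, 4, 8.0, 0.1   # toy parameters, high SNR
X = [[rng.gauss(0.0, math.sqrt(P / L)) for _ in range(L * B)] for _ in range(n)]
sent = (2, 0, 3)                         # transmitted section choices
Y = [sum(X[i][s * B + sent[s]] for s in range(L)) + rng.gauss(0.0, sigma)
     for i in range(n)]
decoded = ls_decode(X, Y, L, B)
print(decoded, decoded == sent)
```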
Decoding, Least Squares

- Least squares is the optimal decoder: it minimizes the probability of error under a uniform distribution on input strings
- The analysis here is of the performance of least squares, without concern for computational feasibility
- A computationally feasible algorithm is discussed today, Session S-Fr-3, 2:40-4:00 p.m.
Performance of Least Squares

Result 1: Section size
- To achieve rates up to capacity, the section size B need only be a polynomial in L.
- In particular, B = L^av, where av is a function of only v.
- av is a decreasing function of v, and av is near 1 for large v.
- Dictionary size N = L^(1+av)
Plot of av

[Figure: the section size rate av plotted against v, for v from 5 to 95; av decreases in v toward 1.]
Performance of Least Squares

Result 2: Error exponent
Let
  ε = Prob{# section mistakes > αL}
be the probability that more than an α fraction of sections are wrong. Then
  ε ≤ exp{−n cv min(α, (C − R)²)}
where cv is a constant that depends only on v.
Performance of Least Squares

From partially correct to completely correct decoding
Let R be a rate for which the partitioned superposition code has
  Prob{# section mistakes > αL} = ε.
Then, through composition with an outer Reed-Solomon code, one obtains a code with rate Rtot = (1 − 2α)R and
  Prob{block error} = ε.

Corollary 3: Block error probability
Taking α of order (C − R)², we have
  Prob{block error} ≤ exp{−n cv (C − Rtot)²}
This is the optimal form of the exponent, as in Shannon and Gallager or Polyanskiy et al.
Performance of Least Squares

Comparison with the PPV curve
Polyanskiy, Poor and Verdu demonstrate that for the Gaussian channel the following approximate relation holds for an optimal code:
  R ≈ C − sqrt(V/n) Q⁻¹(ε) + (log n)/(2n)
where V = (v/2)(v + 2) log²e/(v + 1)² is the channel dispersion and Q = 1 − Φ, with Φ the standard Gaussian distribution function.
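The PPV normal approximation is straightforward to evaluate. The sketch below assumes rates in bits per channel use (so base-2 logarithms, with the dispersion carrying a (log₂ e)² factor) and uses the SNR = 100, ε = 10⁻⁴ settings of the comparison plot.

```python
import math
from statistics import NormalDist

def ppv_rate(v, n, eps):
    """Normal-approximation rate from Polyanskiy-Poor-Verdu:
    R ~ C - sqrt(V/n) Qinv(eps) + (log2 n)/(2n), in bits per channel use."""
    C = 0.5 * math.log2(1 + v)
    V = (v / 2) * (v + 2) * (math.log2(math.e) ** 2) / (v + 1) ** 2
    q_inv = NormalDist().inv_cdf(1 - eps)   # Q^{-1}(eps) = Phi^{-1}(1 - eps)
    return C - math.sqrt(V / n) * q_inv + math.log2(n) / (2 * n)

v, eps = 100.0, 1e-4                        # settings of the comparison plot
C = 0.5 * math.log2(1 + v)
for n in (100, 500):
    print(f"n = {n}: PPV rate ~ {ppv_rate(v, n, eps):.3f} (capacity {C:.3f})")
```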
Comparison with PPV curve

[Figure: rate R (bits/ch. use) versus blocklength n from 0 to 500, comparing capacity, the PPV curve, and the partitioned superposition code, at SNR = 100 and block error probability 10⁻⁴.]
Superpositions for Multiple Users vs our Single User Channel
- Cover introduced superposition codes for multi-user Gaussian channels
- There, the codeword sent is the sum of the codewords for the respective users
- Here we use superpositions to simplify coding for the single user channel
Relationship to Sparse Signal Recovery
- Wainwright and others give necessary and sufficient conditions on n for sparsity recovery. Applied to our setting, where N ≫ L, these give that the required n is a constant × L log(N/L).
- It is natural to call (the reciprocal of) the best such constant, for a given set of allowed signals and a given noise distribution, the compressed sensing capacity or signal recovery capacity.
- In our setting n = (1/R) L log B, so the best constant is 1/C, and thus the signal recovery capacity equals the Shannon capacity.
Error Probability Bound

Analysis
- The least squares estimate β̂ satisfies ‖Y − Xβ̂‖² ≤ ‖Y − Xβ∗‖²
- The error event of a fraction α = ℓ/L of section mistakes is contained in
    Eα = {‖Y − Xβ‖² ≤ ‖Y − Xβ∗‖² for some β ∈ Wrongα}
  where Wrongα is the set of β differing from β∗ in a fraction α of the sections.
- Our bound:
    Prob[Eα] ≤ (L choose αL) exp{−n Dα}
- The exponent Dα is sufficiently large to cancel the combinatorial coefficient, provided B ≥ L^av and R < C.
Some ingredients of the error exponent

Analysis
Ingredients in Dα = D(Δα, ρ²α):
- Δα = α(C − R) + (Cα − αC)
- Cα = (1/2) log(1 + αv)
- 1 − ρ²α = α(1 − α)v/(1 + α²v)
- D(Δ, ρ²) is the cumulant generating function of (1/2)(Z₁² − Z₂²), with Z₁, Z₂ bivariate normal, mean zero, unit variance and correlation ρ.
- For small Δ, D(Δ, ρ²) is near (1/2)Δ²/(1 − ρ²).
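These ingredients can be evaluated numerically; the sketch below uses the slide's expressions with natural logarithms and the v = 15, R = 0.7C settings of the talk's numerical example.

```python
import math

def exponent_ingredients(alpha, v, C, R):
    """Quantities entering D_alpha = D(Delta_alpha, rho_alpha^2), per the
    slide's expressions (natural logs throughout)."""
    C_alpha = 0.5 * math.log(1 + alpha * v)
    delta = alpha * (C - R) + (C_alpha - alpha * C)           # Delta_alpha
    one_minus_rho2 = alpha * (1 - alpha) * v / (1 + alpha ** 2 * v)
    d_small = 0.5 * delta ** 2 / one_minus_rho2   # small-Delta form of D
    return delta, one_minus_rho2, d_small

v = 15.0                       # SNR from the talk's numerical example
C = 0.5 * math.log(1 + v)      # capacity in nats
R = 0.7 * C
for alpha in (0.05, 0.2, 0.5):
    delta, omr, d = exponent_ingredients(alpha, v, C, R)
    print(f"alpha={alpha}: Delta={delta:.4f}, 1-rho^2={omr:.4f}, D~{d:.5f}")
```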
Contributions to Error Exponent

[Figure: exponent of Prob(Eα) plotted against α, with parameters L = 100, B = 8192, P/σ² = 15, C = 2, R = 0.7C, n = 929, N = 819200; the resulting P(# mistakes > ℓ₀) = 1.8 × 10⁻¹².]
Summary
Sparse superposition coding is reliable at rates up to channel capacity
Error probability of the optimum form for R < C
Analysis blends modern statistical regression and information theory
Thank You