Design of High Rate RCM-LDGM Codes
by
Imanol Granada
Tecnun, University of Navarra (2017)
Submitted to the Biomedical Engineering and Science Department in partial fulfillment of the requirements for the degree of
Master of Science of Telecommunications Engineering
at
TECNUN
September 2017
Author
Biomedical Engineering and Science Department
September 29, 2017
Certified by
Pedro M. Crespo
Full Professor
PFG Supervisor
Design of High Rate RCM-LDGM Codes
by
Imanol Granada
Submitted to the Biomedical Engineering and Science Department on September 29, 2017, in partial fulfillment of the
requirements for the degree of Master of Science of Telecommunications Engineering
Abstract
This master's thesis studies the design of high-rate RCM-LDGM codes and is divided into two parts:
• In the first part, it proposes an EXIT chart analysis and a Bit Error Rate (BER) prediction procedure suitable for implementing high-rate codes based on the parallel concatenation of a Rate Compatible Modulation (RCM) code with a Low Density Generator Matrix (LDGM) code. The decoder of a parallel RCM-LDGM code is based on the iterative Sum-Product Algorithm, which exchanges information between variable nodes (VN) and the corresponding two types of check nodes: RCM-CN and LDGM-CN. To obtain good codes that achieve near-Shannon-limit performance, one is required to obtain BER versus SNR behaviors for different families of possible code design parameters. For large input block lengths, this can take a large amount of simulation time. To overcome this design drawback, the proposed EXIT-BER chart procedure predicts this BER versus SNR behavior much faster, and consequently speeds up the design procedure.
• In the second part, it studies two different strategies for transmitting sources with memory. The first strategy consists of exploiting the source statistics in the decoding process by attaching the factor graph of the source to the RCM-LDGM one and running the Sum-Product Algorithm on the entire factor graph. The second strategy, on the other hand, uses the Burrows-Wheeler Transform to convert the source with memory into several independent binary Discrete Memoryless Sources (DMS) and encodes them separately.
Thesis Supervisor: Pedro M. Crespo
Title: Full Professor
Acknowledgments
It is difficult to condense into a few brief lines my gratitude to the many people who
have helped me during these years of study. First of all, I would like to thank
my family for all the support I have received throughout my life, and especially during
these years at university. I want to thank my parents, Rafael and Antxoni, for giving me the
opportunity to study what I chose, my sister Isabel for putting up with my teasing,
and the friends with whom I share my weekends.
I would also like to thank my classmates, Dani, Unai,
Oier, Andres, Fernando and Pablo... and my university friends who in these last
years were no longer with us, Mireia, Santi, Doncel, Javier and Mikel... without them
these university years would not have been the same.
Finally, I cannot fail to thank Pedro Crespo and Xabier Insausti
for introducing me to this fascinating field, and I thank Tecnun and CEIT for everything
I have received during these years. Thanks to the Department of Biomedical Engineering and
Science for those coffees over which I keep learning biology.
Thank you.
Contents
1 Introduction
2 Parallel RCM-LDGM Code
2.1 System Model
2.2 RCM encoder block
2.3 Parallel RCM-LDGM code
2.4 Decoder Block
2.5 Example of RCM and parallel RCM-LDGM codes
3 Exit Chart Analysis and BER prediction
3.1 Exit Chart
3.1.1 EXIT Curve of VND of the parallel RCM-LDGM code
3.1.2 EXIT Curve of CND of the parallel RCM-LDGM code
3.1.3 Trajectories of Iterative Decoding
3.1.4 Predicting the BER from the EXIT Chart
3.2 Simulation Results
3.2.1 Bit error rate from the EXIT charts
3.2.2 Performance of the BER estimation
4 Strategies for Sources with Memory
4.1 Introduction
4.1.1 Source Characterization
4.2 Proposed Strategies
4.2.1 Strategy 1
4.2.2 Strategy 2
4.3 Results
4.3.1 BER vs SNR of Strategy 1
4.3.2 Strategy 2
5 Conclusions
Appendices
A Obtaining κ
B Burrows Wheeler Transform (BWT)
B.1 Direct transformation
B.2 Inverse BWT
List of Figures
2-1 Bipartite graph representation of the RCM bit-to-symbol mapping.
2-2 Factor graph representing the parallel RCM-LDGM code.
2-3 SNR vs BER behaviour of two RCM and parallel RCM-LDGM codes obtained by Monte Carlo simulations.
3-1 Random variable $E^{(vn)}$ models the extrinsic information produced at the VND when $A^{(vn)}$ is the a priori information arriving at the VND. Similarly, $E^{(cn)}$ is the extrinsic information produced at the CND when $A^{(cn)}$ and the channel output are the a priori information arriving at the CND. The factor graph of the VND and CND is given in Fig. 2-2.
3-2 EXIT chart and predicted BER at crossing points at SNR = 24 dB of Scenario 1 with a uniform source ($p_0 = 0.5$) and $d^{(v)}_{LDGM} = 1, 2$.
3-3 EXIT chart and predicted BER at crossing points of Scenario 2 with a nonuniform source ($p_0 = 0.8$) and $d^{(v)}_{LDGM} = 1$.
3-4 Simulated (solid curves) and predicted (dashed-squared curve) BER versus SNR for the RCM-LDGM codes of Scenario 2 with $d^{(v)}_{LDGM} = 1$ for uniform and nonuniform sources ($p_0 = 0.5$, 0.8 and 0.95).
3-5 Simulated (solid curves) and predicted (dashed-squared curve) BER versus SNR for the RCM-LDGM codes of Scenario 1 with $d^{(v)}_{LDGM} = 1$ for uniform and nonuniform sources ($p_0 = 0.5$, 0.8 and 0.95). Vertical lines correspond to channel openings for the code with the same colour.
3-6 Simulated (solid curves) and predicted (dashed-squared curve) BER versus SNR for the RCM-LDGM codes of Scenario 1 with $d^{(v)}_{LDGM} = 2$ for uniform and nonuniform sources ($p_0 = 0.5$, 0.8 and 0.95). Vertical lines correspond to channel openings for the code with the same colour.
4-1 Conventional joint source-channel coding for a P2P communication system of rate $R = K/N$ (binary symbols per channel symbol).
4-2 BWT communication system.
4-3 Example of a Markov chain of two states.
4-4 Example of a hidden Markov model with 3 states and two output symbols "1" and "0".
4-5 Factor graph of the parallel RCM-LDGM code with the source's factor graph attached.
4-6 Equivalent BWT communication system of Fig. 4-2. Note that $K = \sum_{i=1}^{\gamma} K_i$.
4-7 BER vs SNR for Strategy 1 with the sources of the examples with $H(S) = 0.53$ and $H(S) = 0.67$.
4-8 Zero probability profile of a source with $H(S) = 0.92$ characterized by a Markov chain.
4-9 PER obtained by Monte Carlo simulation for a source with $H(S) = 0.92$ characterized by a Markov chain using Strategy 1 and Strategy 2.
4-10 Zero probability profile of a source with $H(S) = 0.57$ characterized by a Markov chain.
4-11 PER obtained by Monte Carlo simulation for a source with $H(S) = 0.57$ characterized by a Markov chain using Strategy 1 and Strategy 2.
A-1 Example of the iterations to find κ for the configuration of Scenario 1 with $d^{(v)}_{LDGM} = 1$ of Section 3.2. The histograms obtained by Monte Carlo simulations, conditioned on $U = 1$, are plotted in blue and the corresponding modeled conditional p.d.f.'s in red.
List of Tables
3.1 Predicted BER values of Fig. 3-2
3.2 Predicted BER values of Fig. 3-3
4.1 Design parameters of Strategy 2 used in simulation results for $H(S) = 0.92$.
4.2 Design parameters of Strategy 2 used in simulation results for $H(S) = 0.57$.
B.1 Example of Burrows-Wheeler Transformation.
B.2 Example of inverse Burrows-Wheeler Transformation.
Chapter 1
Introduction
This master's thesis considers the transmission of information from a given binary source
over an AWGN channel. From an information-theoretic perspective, this problem
can be separated into two independent subproblems. First, the actual amount of
information H, called entropy, that the binary source generates is characterized in
bits per source symbol. An ideal lossless data compression algorithm that exploits
the statistical redundancy of the source symbols can represent the same information
in fewer bits, and the minimum average number of bits per source symbol is given by the
entropy of the source. Second, the maximum number of bits per channel symbol that any
transmitter can accept if these information bits are to be recovered at the receiver
side with low probability of error is characterized by the capacity, C, of the channel,
given by the maximum of the mutual information between the input and output of
the channel, where the maximization is with respect to the input distribution. The
algorithms that achieve transmission rates near capacity are called channel codes. As
the Separation Theorem states, reliable communication of the information generated
by a source is possible if and only if H < C, and it is optimal to solve each of the
subproblems individually. However, when considering practical issues like complexity
and delay, algorithms that treat source compression and channel coding jointly
tend to perform better and are therefore becoming more and more popular.
This family of algorithms is called Joint Source-Channel Codes (JSCC).
In particular, we consider transmission at high spectral efficiency ρ (bits per
complex dimension) using a special type of JSCC based on the parallel concatenation
of a Rate Compatible Modulation (RCM) code [1], [2] and a Low Density Generator
Matrix (LDGM) code [3], [4]. In what follows we will refer to these codes as parallel
RCM-LDGM codes. An RCM code generates random projections (RP) from weighted linear
combinations of the input bits. It is similar to compressed sensing, with the difference
that in an RCM code the input sequence is binary. Due to the fine-grained bit
energy allocation and the resulting dense constellation, RCM codes are able to achieve
smooth rate adaptation over a broad dynamic range. However, they present an error floor at
high signal-to-noise ratio (SNR) values. To address this drawback, the authors in [5], [6]
suggest using an LDGM code in parallel with an RCM code with the aim of lowering
the error floor. Simulation results in [5], [6] show that the parallel RCM-LDGM
code significantly outperforms RCM schemes and that it performs close
to the Shannon limit if suitable design parameters are chosen. However, [5] and [6]
follow a brute-force approach to search for good design parameters, which requires
a large amount of simulation time.
To overcome this problem, in Chapter 3 we propose the use of EXIT charts and
BER predictions in order to speed up the process of finding good design parameters for
RCM codes and parallel RCM-LDGM codes when the information of a memoryless
source is transmitted. EXIT charts were first introduced by ten Brink to analyze and
design iterative coding schemes [7]. Later, the authors in [8] proposed a curve-fitting
procedure based on EXIT charts to design LDPC codes for modulation and
detection. Due to the iterative decoding nature of RCM and parallel RCM-LDGM
codes, EXIT charts are a good method for visually exploring the iterative exchange
of information that occurs in the decoders of those codes. It should be mentioned that
the authors in [9] were the first to use EXIT charts as a design aid for RCM codes.
However, they only consider uniform binary sources. It should also be remarked that, due
to the concatenation of two different types of codes (RCM and LDGM), the previous
work on EXIT charts cannot be directly applied to the current case.
The novel contributions of Chapter 3 are twofold:
1. A design procedure based on EXIT charts for parallel RCM-LDGM codes, suitable
for binary memoryless sources, both uniform and non-uniform.
2. A BER prediction for parallel RCM-LDGM codes based on EXIT charts.
In Chapter 4, considering the same scenario, we study the transmission of sources
with memory. The sources are modeled with Markov models or hidden Markov models,
and two different strategies to exploit the time dependence are presented and
compared. Strategy 1 uses a standard RCM-LDGM encoder, but exploits the source
statistics in the decoding process by attaching the factor graph of the source [11] to
that of the code and running the Sum-Product Algorithm on the entire factor graph.
The second strategy, on the other hand, uses the Burrows-Wheeler Transform
[13] to eliminate the time dependence of the source and converts the overall problem
into the transmission of several independent binary DMS [14] over an AWGN
channel. This allows the encoder to use an optimal spectral efficiency and energy
allocation before transmission, improving the system performance.
Finally, the conclusions are presented in Chapter 5.
For the sake of clarity in the exposition, Chapter 2 begins by providing a
succinct overview of the key concepts of an RCM code before covering parallel
RCM-LDGM codes.
Chapter 2
Parallel RCM-LDGM Code
This chapter gives a brief introduction to RCM and parallel RCM-LDGM codes
and reviews results from previous works on how these two codes perform. We
begin by defining the system model.
2.1 System Model
Consider a point-to-point communications system where a binary memoryless source
with distribution $(p_0; p_1)$ transmits $K$ bits $u = (u_1, u_2, \ldots, u_K)^\top \in \{0, 1\}^{K \times 1}$ across
an AWGN channel to a far-end receiver. To that end, the source symbols $u$ are
encoded by a rate $R = K/N$ (bits per real dimension) parallel RCM-LDGM encoder
and QAM modulated before being transmitted. Let
$$x = (x_1 + i x_2,\; x_3 + i x_4,\; \ldots,\; x_{N-1} + i x_N)$$
be the sequence of $N/2$ complex baseband modulated symbols to be transmitted,
where $x_i \in \mathbb{R}$ denote the coded symbols at the output of the RCM-LDGM encoder.
Assuming independence of the coded symbols $\{x_i\}$, a set of sufficient statistics to
estimate $u$ is given by the output of an equivalent discrete-time AWGN channel
$$y = (y_1, y_2, \ldots, y_N)^\top \in \mathbb{R}^{N \times 1}, \qquad y_i = x_i + n_i, \quad i \in \{1, 2, \ldots, N\},$$
where $\{n_i\}_{i=1}^{N}$ are realizations of i.i.d. real Gaussian random variables with zero mean
and variance $N_0/2$ (i.e., $N_i \sim \mathcal{N}(0, N_0/2)$). At the receiver side, the decoder estimates
the source symbols $u$ from $y$.
2.2 RCM encoder block
An RCM code of rate $K/M$ bits per real dimension is characterized^1 by an $M \times K$ sparse generator matrix $G$. Let $D \subset \mathbb{N}$ be a multiset^2 with $d^{(c)}_{RCM}/2$ elements, where
$\mathbb{N}$ is the set of natural numbers (positive integers). The nonzero entries of $G$ belong to $\pm D$,
and the construction of $G$ is explained next with an example.
1 Without loss of generality, it is assumed that $K$ is an even integer.
2 A multiset is a generalization of the concept of a set that, unlike a set, allows multiple instances of its elements.
For example, if $d^{(c)}_{RCM} = 8$, and assuming $K$ is divisible by $d^{(c)}_{RCM}$, the matrix $G$ is constructed in the following steps:
1. Construct the $K/2 \times K$ matrix $G_0$ as
$$G_0 = \begin{bmatrix}
\Pi(D_{d_3}) & \Pi(D_{d_4}) & \Pi(D_{d_1}) & \Pi(D_{d_2}) \\
\Pi(D_{d_1}) & \Pi(D_{d_2}) & \Pi(D_{d_3}) & \Pi(D_{d_4}) \\
\Pi(D_{d_4}) & \Pi(D_{d_3}) & \Pi(D_{d_2}) & \Pi(D_{d_1}) \\
\Pi(D_{d_2}) & \Pi(D_{d_1}) & \Pi(D_{d_4}) & \Pi(D_{d_3})
\end{bmatrix},$$
where $\Pi(\cdot)$ denotes a random column permutation of a matrix, and $D_{d_l}$ is a
$K/8 \times K/4$ sparse matrix given by
$$D_{d_l} = \begin{bmatrix}
d_l & -d_l & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & d_l & -d_l & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & 0 & d_l & -d_l & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & 0 & 0 & \cdots & d_l & -d_l
\end{bmatrix},$$
with $d_l \in D$, for $l \in \{1, \ldots, d^{(c)}_{RCM}/2\}$.
2. Vertically stack as many copies of $G_0$ as needed (taking only as many rows as needed from
the last one) until the required $M \times K$ matrix $G$ is obtained.
Observe that $d^{(c)}_{RCM}$ gives the number of nonzero entries in any row of $G$. Similarly,
we denote by $d^{(v_k)}_{RCM} \geq 2$ the number of nonzero entries in column $k$ of $G$, and
by $\bar{d}^{(v)}_{RCM}$ its average value, i.e., $\bar{d}^{(v)}_{RCM} = \frac{1}{K} \sum_{k=1}^{K} d^{(v_k)}_{RCM}$.
Given the message $u$, the $j$th RCM symbol, $x_j$, is obtained as
$$x_j = [Gu]_{j,1}, \qquad j \in \{1, \ldots, M\},$$
where $[\,\cdot\,]_{j,1}$ is the element at row $j$ of column 1.
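[Figure: bipartite graph connecting the source bits $u_1, u_2, u_3, \ldots, u_{K-1}, u_K$ to the adder nodes that produce the RCM symbols $x_1, x_2, x_3, \ldots, x_M$; each edge is labeled with its weight $G_{j,i}$.]
Figure 2-1: Bipartite graph representation of the RCM bit-to-symbol mapping.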
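As an illustration only, the two construction steps of this section and the mapping $x = Gu$ can be sketched in Python for the case $d^{(c)}_{RCM} = 8$. The function names (`permuted_block`, `build_g0`, `build_rcm_matrix`, `rcm_encode`) are hypothetical and do not come from this thesis. The sketch also assumes that every stacked copy of $G_0$ draws fresh random permutations, which the text does not specify.

```python
import random


def permuted_block(d, K, rng):
    """Pi(D_d): a (K/8) x (K/4) block with one (d, -d) pair per row,
    followed by a random column permutation."""
    rows, cols = K // 8, K // 4
    perm = list(range(cols))
    rng.shuffle(perm)
    block = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        # Unpermuted D_d has d at column 2r and -d at column 2r+1.
        block[r][perm[2 * r]] = d
        block[r][perm[2 * r + 1]] = -d
    return block


def build_g0(weights, K, rng):
    """K/2 x K matrix G0 with the 4 x 4 block layout of Section 2.2."""
    d1, d2, d3, d4 = weights
    layout = [(d3, d4, d1, d2), (d1, d2, d3, d4),
              (d4, d3, d2, d1), (d2, d1, d4, d3)]
    g0 = []
    for block_row in layout:
        blocks = [permuted_block(d, K, rng) for d in block_row]
        for r in range(K // 8):
            g0.append([v for b in blocks for v in b[r]])
    return g0


def build_rcm_matrix(weights, K, M, seed=0):
    """Stack copies of G0 and keep the first M rows (assumption: each
    copy uses new random permutations)."""
    assert len(weights) == 4 and K % 8 == 0
    rng = random.Random(seed)
    G = []
    while len(G) < M:
        G.extend(build_g0(weights, K, rng))
    return G[:M]


def rcm_encode(G, u):
    """RCM symbols x_j = [G u]_j for a binary message u."""
    return [sum(g * b for g, b in zip(row, u)) for row in G]
```

With the weight set $\{1, 2, 4, 4\}$, every row of the resulting matrix has eight nonzero entries whose absolute values add up to 22, and the all-ones message maps to the all-zero symbol vector because each weight appears once with each sign.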
2.3 Parallel RCM-LDGM code
A parallel RCM-LDGM code of rate $K/(M + I)$ bits per real dimension consists of the
parallel concatenation of an RCM code of rate $K/M$ with a high-rate binary regular
LDGM code that produces $I$ non-systematic coded binary symbols from its $K$ input
binary symbols (refer to Fig. 2-2). That is, the encoded symbol sequence is given by^3
$$x^\top = \left[\, (Gu)^\top \;\middle|\; \left( 2 \cdot \left( (Pu \bmod 2) - \tfrac{1}{2} \right) \right)^\top \right],$$
where $G$ is the $M \times K$ RCM matrix introduced in Section 2.2, whose entry at
the $j$th row and the $i$th column is $g_{j,i}$, and $P$ is an $I \times K$ sparse matrix with binary
entries. The matrix $P$ is characterized by the pair $(d^{(v)}_{LDGM}, d^{(c)}_{LDGM})$, denoting the
number of nonzero elements of a column and of a row, respectively. Recall that the
objective of the LDGM code is to reduce the error floor produced by the RCM code
without degrading the RCM waterfall region.
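[Figure: factor graph with variable nodes $U_1, U_2, U_3, U_4, \ldots, U_K$ connected to check nodes $T_1, T_2, \ldots, T_M, \ldots, T_N$, which are attached to the symbols $x_1, x_2, \ldots, x_M$ ($M$ RCM symbols) and $x_{M+1}, x_{M+2}, \ldots, x_{N-1}, x_N$ ($I = N - M$ coded bits of the LDGM code); the messages $q_{k,i}$ and $r_{i,k}$ travel along the edges.]
Figure 2-2: Factor graph representing the parallel RCM-LDGM code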
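The encoding rule of this section can be sketched as follows. This is a minimal illustration with hypothetical function names (`ldgm_encode`, `parallel_encode`) and toy matrices chosen by hand; it is not an implementation taken from this thesis.

```python
def ldgm_encode(P, u):
    """Map each LDGM parity bit c = (P u mod 2) to the 2-PAM symbol
    2 * (c - 1/2) = 2c - 1, i.e. 0 -> -1 and 1 -> +1."""
    return [2 * (sum(p * b for p, b in zip(row, u)) % 2) - 1 for row in P]


def parallel_encode(G, P, u):
    """x = [ G u | 2 * ((P u mod 2) - 1/2) ]: M RCM symbols followed by
    I LDGM symbols."""
    rcm = [sum(g * b for g, b in zip(row, u)) for row in G]
    return rcm + ldgm_encode(P, u)


# Toy example (hand-picked matrices, for illustration only).
G = [[1, -1, 2, -2],
     [2, -2, 1, -1]]
P = [[1, 1, 0, 0],
     [0, 1, 1, 1]]
x = parallel_encode(G, P, [1, 0, 1, 1])
```

Here the first two entries of `x` are the RCM symbols and the last two are the 2-PAM mapped LDGM parity bits.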
2.4 Decoder Block
At the receiver side, upon receiving $y$, the decoder approximates a MAP detector by
finding the estimate $\hat{u}_k$ of $u_k$, for $k \in \{1, 2, \ldots, K\}$, as
$$\hat{u}_k = \arg\max_{u_k \in \{0,1\}} p(u_k \mid y), \qquad (2.1)$$
where $p(\cdot \mid \cdot)$ is the conditional probability mass function of $u_k$ given $y$. To find these
conditional probabilities, the sum-product algorithm (SPA) [10] is applied to the
factor graph that models the overall communications system. This factor graph is
sketched in Fig. 2-2. Let $r_{j,k}$ and $q_{k,j}$ denote the log-likelihood ratio (LLR)
messages passed from the $j$th check node (CN) to the $k$th variable node (VN), and from the
$k$th VN to the $j$th CN, respectively. In what follows, we denote by $n(U_k) \setminus T_j$ and
$n(T_j) \setminus U_k$ the set of CNs connected to VN $U_k$ excluding CN $T_j$, and the set
of VNs connected to CN $T_j$ excluding VN $U_k$, respectively.
3 Observe that the last $I$ symbols are encoded using a 2-PAM modulation.
At each iteration $t$, the sum-product algorithm is implemented by the sequential
execution of the following steps:
• STEP 1. $q^{(t)}_{k,i}$: Message passing from variable nodes $\{U_k\}_{k=1}^{K}$ to RCM
and LDGM check nodes $\{T_i\}_{i=1}^{M+I}$.
$$q^{(t)}_{k,i} = \sum_{j \in n(U_k) \setminus T_i} r^{(t-1)}_{j,k} + \log\left(\frac{p_1}{p_0}\right),$$
where $r^{(0)}_{j,k} = 0$ for $k \in \{1, \ldots, K\}$ and $j \in n(U_k) \setminus T_i$, and $(p_1; p_0)$ is the
distribution of the memoryless binary source.
• STEP 2. $r^{(t)}_{i,k}$: Message passing from RCM-LDGM check nodes $\{T_i\}_{i=1}^{M+I}$
to variable nodes $\{U_k\}_{k=1}^{K}$.
– Computation at RCM check nodes $\{T_j\}_{j=1}^{M}$: Observe that $x_j = \sum_{i} g_{j,i} u_i$
and define $a_{j,k} = \sum_{i \neq k} g_{j,i} u_i$. Combining both terms we get $x_j = a_{j,k} + g_{j,k} u_k$
for all $U_k \in n(T_j)$, and the received symbol is $y_j = x_j + n_j$. The message
$r^{(t)}_{j,k}$ is calculated as
$$r^{(t)}_{j,k} = \log\left( \frac{\sum_{z} P^{(t)}(a_{j,k} = z) \cdot e^{-(y_j - z - g_{j,k})^2 / N_0}}{\sum_{z} P^{(t)}(a_{j,k} = z) \cdot e^{-(y_j - z)^2 / N_0}} \right), \qquad (2.2)$$
where the sum in $z$ is over all possible values that $a_{j,k}$ can
take. Notice that $P^{(t)}(a_{j,k} = z)$, the probability that $a_{j,k} = z$ at iteration $t$, is
calculated in a straightforward manner by convolving the distributions of the terms
$g_{j,i} u_i$ in the summation, which in turn are obtained
from the received LLR messages $q^{(t)}_{i,j}$, $i \neq k$. An efficient way to implement these
convolutions is explained in [1].
– Computation at LDGM check nodes $\{T_i\}_{i=M+1}^{M+I}$: As in standard
LDGM codes, the LLR message transmitted from the $i$th check node to
the variable node $U_k$ is given by
$$r^{(t)}_{i,k} = 2\,\mathrm{atanh}\left( \tanh\left(\frac{\gamma_i}{2}\right) \prod_{j \in n(T_i) \setminus U_k} \tanh\left(\frac{q^{(t)}_{j,i}}{2}\right) \right), \qquad \text{where } \gamma_i = \frac{(y_i + 1)^2 - (y_i - 1)^2}{N_0}.$$
At the end of the iterations, when $t = t_{max}$, an estimate of $u_k$ (see (2.1)) can be
calculated as
$$\hat{u}_k = \begin{cases} 1, & \sum_{j \in n(U_k)} r^{(t_{max})}_{j,k} > 0 \\ 0, & \text{otherwise.} \end{cases}$$
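To make the RCM check-node computation in (2.2) concrete, the following sketch builds the distribution of $a_{j,k}$ by direct convolution of two-point distributions and then evaluates the LLR. It is a plain illustration with hypothetical function names, not the efficient implementation of [1], and it assumes LLRs defined as $\log(P(u = 1)/P(u = 0))$ as in STEP 1. No protection against numerical underflow for received values far from every candidate $z$ is included.

```python
import math


def pmf_of_partial_sum(weights, llrs):
    """Distribution of a = sum_i g_i * u_i, where bit u_i has
    LLR q_i = log(P(u_i = 1) / P(u_i = 0))."""
    pmf = {0: 1.0}
    for g, q in zip(weights, llrs):
        p1 = 1.0 / (1.0 + math.exp(-q))
        new = {}
        for z, pz in pmf.items():
            # u_i = 0 leaves the partial sum unchanged, u_i = 1 adds g.
            new[z] = new.get(z, 0.0) + pz * (1.0 - p1)
            new[z + g] = new.get(z + g, 0.0) + pz * p1
        pmf = new
    return pmf


def rcm_check_to_variable(y, g_k, other_weights, other_llrs, n0):
    """LLR message of (2.2) from an RCM check node to variable node k.

    y             -- received symbol y_j
    g_k           -- weight g_{j,k} of the target bit
    other_weights -- weights g_{j,i} of the remaining bits (i != k)
    other_llrs    -- incoming LLR messages for those bits
    n0            -- noise level N_0 (noise variance is N_0 / 2)
    """
    pmf = pmf_of_partial_sum(other_weights, other_llrs)
    num = sum(p * math.exp(-(y - z - g_k) ** 2 / n0) for z, p in pmf.items())
    den = sum(p * math.exp(-(y - z) ** 2 / n0) for z, p in pmf.items())
    return math.log(num / den)
```

For a check node with a single connected bit the message reduces to $(2 y_j g_{j,k} - g_{j,k}^2)/N_0$, which gives a quick sanity check of the sketch.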
2.5 Example of RCM and parallel RCM-LDGM
codes
In this section we show the BER vs SNR behaviour of RCM and parallel RCM-LDGM
codes. To that end, we consider a spectral efficiency of ρ = 7.4 bits per complex
dimension with source blocks of K = 37000 uniformly distributed bits. Therefore, the
RCM code will contain 10000 RCM symbols. For the parallel RCM-LDGM code, I has been
chosen to be 200, leading to 9800 RCM symbols in parallel with the 200 LDGM ones. Two
different weight sets are considered, D1 = {1, 2, 4, 4} and D2 = {2, 3, 4, 8}.
As observed in Fig. 2-3, the RCM codes suffer from a large error floor,
which is corrected when some LDGM symbols are concatenated in parallel. The
substitution of a few RCM symbols by LDGM ones (maintaining the spectral
efficiency) leads to worse behavior at low SNR. However, from a given SNR onward, each
parallel code outperforms its RCM counterpart.
[Plot: BER (logarithmic axis, 10^-4 to 10^-1) versus SNR (22 to 28) for four codes: RCM 1 2 4 4, RCM-LDGM 1 2 4 4, RCM 2 3 4 8 and RCM-LDGM 2 3 4 8.]
Figure 2-3: SNR vs BER behaviour of two RCM and parallel RCM-LDGM codes obtained by Monte Carlo simulations.
Finally, it can be seen that with a good selection of parameters, the RCM-LDGM
codes can perform close to the Shannon limit. Obtaining the BER vs SNR
curves by Monte Carlo simulation is computationally expensive, and therefore
Chapter 3 develops an alternative way to design good codes based on EXIT charts.
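The symbol counts quoted in this example follow from the definition of the spectral efficiency as bits per complex dimension, i.e. ρ = 2K/N with N real coded symbols. A small sketch of that arithmetic, using a hypothetical helper name, is:

```python
def symbol_budget(K, rho, I):
    """Return (N, M): the total number of real coded symbols N = 2K / rho
    and the number M = N - I left for the RCM code when I symbols are
    assigned to the LDGM code."""
    N = round(2 * K / rho)  # rho counts bits per complex (two real) dimensions
    M = N - I
    return N, M


N, M = symbol_budget(K=37000, rho=7.4, I=200)
```

With K = 37000, ρ = 7.4 and I = 200 this reproduces the 10000 symbols of the pure RCM code and the 9800 RCM symbols of the parallel code.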
Chapter 3
Exit Chart Analysis and BER
prediction
The selection of good design parameters for the codes used in any point-to-point
communications system is crucial to achieve performance close to the Shannon
fundamental limit. For example, in the case of a single RCM code one
has to select a good weight set D. For a given set of design parameters, the authors
in [5, 6] perform BER vs. SNR simulations of the overall communications chain
and repeat this procedure until a good combination of parameters is found. The
drawback of this procedure is its long computation time. To overcome this
problem, the authors in [9] proposed an EXIT chart analysis for fast performance
prediction of design parameters, thereby shortening the RCM parameter selection
procedure. As already mentioned, this EXIT chart analysis cannot be directly
applied here, since two different types of check nodes, RCM and LDGM, are blended together.
This chapter extends the analysis to parallel RCM-LDGM codes and also considers
nonuniform sources. Furthermore, it presents a BER prediction analysis based
on EXIT charts that was not previously considered in the literature for this type of
codes.
3.1 Exit Chart
The model for EXIT chart analysis is composed of two types of decoders (refer to Fig.
3-1): a variable node decoder (VND) composed of all variable nodes, and a check node
decoder (CND) composed of two different types of check nodes, RCM and LDGM.
The LLR values exchanged between the two decoders are modeled as outcomes
of real-valued random variables E (output from either a VND or CND) and A (input
to either a VND or CND).
[Block diagram: source ($U$) → RCM-LDGM encoder ($X$) → channel ($Y$) → decoder ($\hat{U}$). The decoder consists of the CND and the VND connected through the interleaver π, exchanging $A^{(vn)}$, $E^{(vn)}$, $A^{(cn)}$ and $E^{(cn)}$, with $I(U; E^{(cn)}) = I(U; A^{(vn)})$ and $I(U; A^{(cn)}) = I(U; E^{(vn)})$.]
Figure 3-1: Random variable $E^{(vn)}$ models the extrinsic information produced at the VND when $A^{(vn)}$ is the a priori information arriving at the VND. Similarly, $E^{(cn)}$ is the extrinsic information produced at the CND when $A^{(cn)}$ and the channel output are the a priori information arriving at the CND. The factor graph of the VND and CND is given in Fig. 2-2.
To characterize the behavior of a node decoder, either check or variable, we
describe the mutual information $I(E;U)$ between the decoder's extrinsic LLR output
$E$ and a binary source symbol $U$ with distribution $(p_1; p_0)$ as a function of the
mutual information $I(A;U)$ between the decoder's a priori LLR input $A$ and $U$. These
mutual informations are given by
$$I(L;U) = p_0 \int_{-\infty}^{\infty} f_L(\xi|0) \log_2\left( \frac{f_L(\xi|0)}{p_0 f_L(\xi|0) + p_1 f_L(\xi|1)} \right) d\xi + p_1 \int_{-\infty}^{\infty} f_L(\xi|1) \log_2\left( \frac{f_L(\xi|1)}{p_0 f_L(\xi|0) + p_1 f_L(\xi|1)} \right) d\xi, \qquad (3.1)$$
where $L \in \{A, E\}$ and $f_L(\xi|u)$, for $u = 0, 1$, is the conditional probability density
function of $L$ given $U = u$, which must be taken according to the node decoder under
consideration, that is, whether it is a VND or a CND (these conditional p.d.f.'s are
developed in Sections 3.1.1 and 3.1.2). In the sequel, we will denote $I(L;U)$ for a
VND or a CND as $I_{L,VND} = I(L^{(vn)};U)$ or $I_{L,CND} = I(L^{(cn)};U)$, respectively.
In the course of deriving the EXIT chart for the parallel RCM-LDGM code, we
will need the following parameters:
$$p^{(vn)}_{RCM} \triangleq \frac{M \cdot d^{(v)}_{RCM}}{M \cdot d^{(v)}_{RCM} + I \cdot d^{(v)}_{LDGM}} \qquad (3.2)$$
and
$$p^{(cn)}_{RCM} \triangleq \frac{d^{(c)}_{RCM}}{d^{(c)}_{RCM} + d^{(c)}_{LDGM}}, \qquad (3.3)$$
which give the average fraction of edge connections arriving at a VN from
an RCM check node, and the fraction of edge connections arriving at an RCM check
node from a VN, respectively.
3.1.1 EXIT Curve of VND of the parallel RCM-LDGM code
The EXIT curve of the VND is given by the transfer characteristic between $I_{E,VND} = I(E^{(vn)};U)$
and $I_{A,VND} = I(A^{(vn)};U)$. Note that the realizations of both r.v.'s are
the messages passed by the sum-product algorithm, $\{r_{i,k}\}$ and $\{q_{k,i}\}$, respectively. In
order to evaluate these mutual informations from (3.1), the conditional p.d.f.'s of the
a priori $A^{(vn)}$ and the extrinsic $E^{(vn)}$ at a variable node decoder, given $U$, have to be
found.
Note that, as opposed to previous works on EXIT charts, in an RCM-LDGM code
one has to consider two types of a priori messages arriving at a VND: first, the
messages arriving from an edge connected to an RCM check node, $A^{(vn)}_{RCM}$, and second,
the messages arriving from an edge connected to an LDGM check node, $A^{(vn)}_{LDGM}$.
In order to simplify calculations, it is assumed in [9] that the conditional p.d.f. of
$A^{(vn)}_{RCM}$ is modeled as the LLR r.v. obtained at the output of a virtual AWGN channel
when its inputs are uniform^1 binary source symbols $U$, i.e.,
$$Y = U + N, \qquad N \sim \mathcal{N}(0, \sigma^2). \qquad (3.4)$$
Therefore, the LLR a priori message $A^{(vn)}_{RCM}$ at a variable node is
$$A^{(vn)}_{RCM} = \log\left( \frac{P(u = 1 \mid Y)}{P(u = 0 \mid Y)} \right) = \frac{2U - 1}{2\sigma^2} + \frac{N}{\sigma^2}.$$
1 The a priori probability of the source symbols is already accounted for in STEP 1 of the SPA.
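With this in mind, the following assumptions are made in our case:
$$A^{(vn)}_{RCM} \mid U \sim \mathcal{N}\left( (2u - 1)\,\sigma_A^2 / 2,\; \sigma_A^2 \right) \qquad (3.5)$$
$$A^{(vn)}_{LDGM} \mid U \sim \mathcal{N}\left( (2u - 1)\,\sigma_B^2 / 2,\; \sigma_B^2 \right). \qquad (3.6)$$
In other words, the conditional p.d.f.'s of $A^{(vn)}_{RCM}$ and $A^{(vn)}_{LDGM}$ are modeled as the LLR
r.v.'s obtained at the output of two virtual AWGN channels $Y_1 = U + N_1$, with $N_1 \sim \mathcal{N}(0, 1/\sigma_A^2)$,
and $Y_2 = U + N_2$, with $N_2 \sim \mathcal{N}(0, 1/\sigma_B^2)$, respectively, when their inputs are the binary
source symbols $U$. Indeed, for a virtual channel with noise variance $\sigma^2$, the LLR has conditional
mean $(2u - 1)/(2\sigma^2)$ and variance $1/\sigma^2$, which matches (3.5) for $\sigma_A^2 = 1/\sigma^2$.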
The EXIT chart for the VND is now defined by the mapping $f$
$$f : \mathbb{R}^2_+ \longrightarrow [0, 1] \times [0, 1], \qquad (\sigma_A^2, \sigma_B^2) \longmapsto (I_{A,VND}, I_{E,VND}).$$
As explained later, given $(\sigma_A^2, \sigma_B^2)$, the computation of the mutual informations
$I_{A,VND}$ and $I_{E,VND}$ is based on the conditional p.d.f.'s of $A^{(vn)}|U$ in (3.9) and $E^{(vn)}|U$
in (3.14), respectively, which in turn are derived from distributions (3.5) and (3.6).
The main difference of having two types of CNs rather than one, as in the case of
a standard EXIT chart, is that these mutual informations depend on two variables,
$\sigma_A^2$ and $\sigma_B^2$, instead of on one, $\sigma_A^2$.
Observe that if the variables $\sigma_A^2$ and $\sigma_B^2$ were taken independently, $f$ would define a
closed surface $R$ in $[0, 1] \times [0, 1]$. However, these two variables are not independent,
since their actual values are sequentially obtained at each iteration of the sum-product
algorithm. Therefore, for each code, an implicit curve $F_\sigma(\sigma_A^2, \sigma_B^2) = 0$ in the $\mathbb{R}^2_+$
domain plane will result, and consequently its image under $f$ will produce an
implicit EXIT VND curve $F_I(I(A;U), I(E;U)) = 0$ in $[0, 1] \times [0, 1]$.
Calculating this exact implicit trajectory is computationally expensive, which
defeats the purpose of an EXIT chart analysis. Simulations have shown that the
implicit curve $F_\sigma = 0$ can be linearly approximated. Therefore, in order to further
simplify our model, we will assume that $F_\sigma(\sigma_A^2, \sigma_B^2) = 0$ is modeled as $\sigma_B^2 - \sigma_A^2/\kappa = 0$. This
yields
$$A^{(vn)}_{RCM} \mid U \sim \mathcal{N}\left( (2u - 1)\,\frac{\sigma_A^2}{2},\; \sigma_A^2 \right) \qquad (3.7)$$
$$A^{(vn)}_{LDGM} \mid U \sim \mathcal{N}\left( (2u - 1)\,\frac{\sigma_A^2}{2\kappa},\; \frac{\sigma_A^2}{\kappa} \right). \qquad (3.8)$$
The constant κ scales the variance of the distribution of $A^{(vn)}_{LDGM}$ with respect to
the variance of $A^{(vn)}_{RCM}$. The steps to compute κ are explained in Appendix A.
Let us now go back to the computation of the p.d.f. of the a priori messages $A^{(vn)}$
at a VND. Since we have two types of a priori messages, the corresponding conditional
p.d.f. of $A^{(vn)}|U$ is given by the compound conditional p.d.f.
$$A^{(vn)} \mid U \sim f_{A^{(vn)}}(a|u) = f_{A^{(vn)}_{RCM}}(a|u)\, p^{(vn)}_{RCM} + f_{A^{(vn)}_{LDGM}}(a|u)\, (1 - p^{(vn)}_{RCM}), \qquad (3.9)$$
where the Gaussian p.d.f.'s $f_{A^{(vn)}_{RCM}}$ and $f_{A^{(vn)}_{LDGM}}$ are given in (3.7) and (3.8), and $p^{(vn)}_{RCM}$ in (3.2).
Once $f_{A^{(vn)}}(a|u)$ is known, the conditional distribution of the extrinsic r.v. $E^{(vn)}$
at the variable node decoder can be computed. To this end, note that there are
two types of LLR messages: messages passed on an edge connecting a VN to an
RCM check node (modeled by a r.v. denoted by $E^{(vn)}_{RCM}$), and messages passed on an
edge connecting a VN to an LDGM check node (modeled by a r.v. $E^{(vn)}_{LDGM}$). From the
corresponding connections of the factor graph, we obtain
$$E^{(vn)}_{RCM} = (\bar{d}^{(v)}_{RCM} - 1)\, A^{(vn)}_{RCM} + d^{(v)}_{LDGM}\, A^{(vn)}_{LDGM} + \log\left(\frac{p_1}{p_0}\right) \qquad (3.10)$$
$$E^{(vn)}_{LDGM} = \bar{d}^{(v)}_{RCM}\, A^{(vn)}_{RCM} + (d^{(v)}_{LDGM} - 1)\, A^{(vn)}_{LDGM} + \log\left(\frac{p_1}{p_0}\right) \qquad (3.11)$$
and the corresponding conditional p.d.f.'s
$$f_{E^{(vn)}_{RCM}}(e|u) = \mathcal{N}\left( (2u - 1)\, \frac{\sigma_A^2 \left( \bar{d}^{(v)}_{RCM} - 1 + \frac{d^{(v)}_{LDGM}}{\kappa} \right)}{2} + \log\left(\frac{p_1}{p_0}\right),\; \sigma_A^2 \left( \bar{d}^{(v)}_{RCM} - 1 + \frac{d^{(v)}_{LDGM}}{\kappa} \right) \right) \qquad (3.12)$$
$$f_{E^{(vn)}_{LDGM}}(e|u) = \mathcal{N}\left( (2u - 1)\, \frac{\sigma_A^2 \left( \bar{d}^{(v)}_{RCM} + \frac{d^{(v)}_{LDGM} - 1}{\kappa} \right)}{2} + \log\left(\frac{p_1}{p_0}\right),\; \sigma_A^2 \left( \bar{d}^{(v)}_{RCM} + \frac{d^{(v)}_{LDGM} - 1}{\kappa} \right) \right). \qquad (3.13)$$
Again, since we have two types of extrinsic messages, the overall conditional p.d.f.
of the extrinsic LLR r.v. $E^{(vn)}|U$ is given by the compound conditional p.d.f.
$$E^{(vn)} \mid U \sim f_{E^{(vn)}}(e|u) = f_{E^{(vn)}_{RCM}}(e|u)\, p^{(cn)}_{RCM} + f_{E^{(vn)}_{LDGM}}(e|u)\, (1 - p^{(cn)}_{RCM}), \qquad (3.14)$$
where $p^{(cn)}_{RCM}$ is given in (3.3). From $f_{A^{(vn)}}(a|u)$ and $f_{E^{(vn)}}(e|u)$, the mutual
informations $I_{A,VND}$ and $I_{E,VND}$ are obtained as parametric expressions of $\sigma_A^2$.
3.1.2 EXIT Curve of CND of the parallel RCM-LDGM code
Since the a priori information $A^{(cn)}$ at the check node decoder equals the extrinsic information $E^{(vn)}$ at the variable node decoder (refer to Fig. 3-1), the distribution $f_{A^{(cn)}}(a|u)$ of $A^{(cn)}$ is given by the p.d.f. in (3.14) with $E$ replaced by $A$, that is,
$$A^{(cn)}|U \sim f_{A^{(cn)}}(a|u) = f_{A^{(cn)}_{\mathrm{RCM}}}(a|u)\, p^{(cn)}_{\mathrm{RCM}} + f_{A^{(cn)}_{\mathrm{LDGM}}}(a|u)\,\bigl(1 - p^{(cn)}_{\mathrm{RCM}}\bigr) \tag{3.15}$$
From (3.1) we get $I_{A,CND}$.
To compute $I_{E,CND}$, we need the conditional p.d.f. $f_{E^{(cn)}}(e|u)$ of the extrinsic LLR information $E^{(cn)}$ at the CND. It is obtained by running Step 2 (Section 2.4) of the sum-product algorithm with $q_{k,i} = a$, where $a$ are realizations of a random variable $A^{(cn)}$ with conditional p.d.f. (3.15). The empirical conditional p.d.f. $f_{E^{(cn)}}(e|u)$ is then estimated from the histogram of the realizations $\{r_{i,k}\}$.
3.1.3 Trajectories of Iterative Decoding
To account for the iterative nature of the suboptimal SPA, the VND and CND transfer characteristics are plotted in a single diagram, with the axes of the VND transfer characteristic swapped. As long as the SNR is large enough that the two transfer curves do not intersect, the iterative process reaches its maximum mutual information values, that is, $(H(p_0), H(p_0))$, and consequently achieves a low BER. Therefore, the code design problem reduces to finding the weight set $D$, the number of LDGM symbols $I$ and the degree $d^{(v)}_{\mathrm{LDGM}}$ for which the curves first intersect at an SNR as close as possible to the corresponding Shannon limit. For more details on this procedure, we refer to the simulation results section.
Remark 1: Note that the EXIT curve of the VND only depends on the values of $d^{(v)}_{\mathrm{RCM}}$ and $d^{(v)}_{\mathrm{LDGM}}$. On the other hand, the EXIT curve of the CND depends on all the parameters of the code and source, i.e., $\{D, \mathrm{SNR}, d^{(v)}_{\mathrm{RCM}}, d^{(v)}_{\mathrm{LDGM}}, d^{(c)}_{\mathrm{LDGM}}, M, I\}$.
Remark 2: The EXIT chart for an RCM code can be calculated as a particular case of the parallel LDGM-RCM code where $p^{(vn)}_{\mathrm{RCM}} = 1$ and $p^{(cn)}_{\mathrm{RCM}} = 1$.
3.1.4 Predicting the BER from the EXIT Chart
The EXIT chart can be used to obtain an estimate of the BER after an arbitrary number of iterations. Following the sum-product algorithm, the LLR value of the decision variable $s_k$ of variable node $k$ at the end of a set of iterations is obtained as the sum of all LLR messages $r_{i,k}$ passed over the edges connecting a CN $i$ with the VN $k$, that is, $s_k = \sum_{i \in n(U_k)} r_{i,k}$. From the previous assumptions, and for the sake of deriving a simple formula for the bit error probability $P_b$, the $r_{i,k}$ can be considered realizations of independent Gaussian random variables $A^{(vn)}_{\mathrm{RCM}}$ and $A^{(vn)}_{\mathrm{LDGM}}$. The conditional p.d.f. of $S$ given $U$ is then
$$S|U \sim \mathcal{N}\bigl(\mu_S(u), \sigma_S^2\bigr),$$
with $\sigma_S^2 = d^{(v)}_{\mathrm{RCM}}\,\sigma_A^2 + \frac{d^{(v)}_{\mathrm{LDGM}}}{\kappa}\,\sigma_A^2$ and $\mu_S(u) = (2u-1)\frac{\sigma_S^2}{2} + \log\left(\frac{p_1}{p_0}\right)$. The BER performance is now obtained as
$$P_b = p_0\, P(S > 0 \,|\, U = 0) + p_1\, P(S \le 0 \,|\, U = 1). \tag{3.16}$$
Observe that
$$P(S > 0 \,|\, U = 0) = 1 - Q\left(\frac{\mu_S(0)}{\sigma_S}\right) \quad \text{and} \quad P(S \le 0 \,|\, U = 1) = Q\left(\frac{\mu_S(1)}{\sigma_S}\right),$$
where $Q(\xi)$ is the Q function
$$Q(\xi) = \frac{1}{\sqrt{2\pi}} \int_{\xi}^{\infty} e^{-\frac{y^2}{2}}\,dy.$$
Remark 3: The BER for an RCM code can be estimated as a particular case of the parallel LDGM-RCM code with $d^{(v)}_{\mathrm{LDGM}} = 0$.
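The prediction (3.16) can be evaluated directly once $\sigma_A^2$, the node degrees, $\kappa$ and $p_0$ are fixed. The following Python sketch does so using the identity $Q(x) = \frac{1}{2}\operatorname{erfc}(x/\sqrt{2})$; the function and argument names are illustrative, and the numerical values used to call it are assumptions rather than design values from the thesis.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def predicted_ber(sigma2_a, d_rcm, d_ldgm, kappa, p0):
    """Bit error probability (3.16) under the Gaussian model for S | U."""
    p1 = 1.0 - p0
    var_s = d_rcm * sigma2_a + (d_ldgm / kappa) * sigma2_a
    std_s = math.sqrt(var_s)
    prior = math.log(p1 / p0)
    mu0 = -var_s / 2.0 + prior                 # mean of S given U = 0
    mu1 = var_s / 2.0 + prior                  # mean of S given U = 1
    p_err_0 = 1.0 - q_function(mu0 / std_s)    # P(S > 0 | U = 0)
    p_err_1 = q_function(mu1 / std_s)          # P(S <= 0 | U = 1)
    return p0 * p_err_0 + p1 * p_err_1
```

For a uniform source the two error terms coincide and the expression reduces to $P_b = Q(\sigma_S/2)$, which gives a quick sanity check of an implementation. Setting `d_ldgm = 0` corresponds to the RCM-only case of Remark 3.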
3.2 Simulation Results
In this section, we apply the EXIT chart analysis and BER prediction method proposed in Section 3.1 to find good RCM-LDGM codes while reducing their design simulation time. The considered scenarios are the following:
1. A spectral efficiency of ρ = 7.4 with source blocks of K = 37000 bits and coded
as M = 9800 RCM and I = 200 LDGM symbols.
2. A spectral efficiency of ρ = 4 with source blocks of K = 25000 and coded as
M = 11350 RCM and I = 135 LDGM symbols.
We consider six different weight sets. For Scenario 1, $D_1 = \{1, 2, 4, 4\}$, $D_2 = \{2, 3, 4, 8\}$ and $D_3 = \{2, 3, 7, 10\}$, with a $P$ matrix with two different values, $d^{(v)}_{\mathrm{LDGM}} = 1$ and $d^{(v)}_{\mathrm{LDGM}} = 2$. For Scenario 2, $D_4 = \{1, 1, 1, 1, 2, 2\}$, $D_5 = \{1, 1, 2, 2, 4, 4\}$ and $D_6 = \{2, 2, 3, 3, 4, 4\}$, with a $P$ matrix with $d^{(v)}_{\mathrm{LDGM}} = 1$.
Therefore, nine different codes are analyzed. Furthermore, in each scenario three different sources are considered: one uniform ($p_0 = 0.5$) and two nonuniform ($p_0 = 0.8$ and $p_0 = 0.95$). We begin by analyzing the EXIT chart.
3.2.1 Bit error rate from the EXIT charts
As explained in Section 3.1.4, an estimated BER can be assigned to each point of the Variable Node (VN) curve of the EXIT chart. Therefore, the BER of a particular code at a given SNR is obtained from the value at the VN point where both curves intersect. This is illustrated in Figs. 3-2 and 3-3, where the EXIT charts of both scenarios are plotted together with the BER values of the VN curve at the intersection points. The EXIT chart of Fig. 3-2 corresponds to Scenario 1 with a uniform source and $d^{(v)}_{\mathrm{LDGM}} = 1$ and 2. Fig. 3-3, on the other hand, corresponds to Scenario 2 for a nonuniform source with entropy $H(0.2) = 0.72$, where only the case $d^{(v)}_{\mathrm{LDGM}} = 1$ is considered. The predicted BER points of these two examples are summarized in Tables 3.1 and 3.2.
Note that when there is no crossing point in the EXIT chart (i.e., there is an open channel between the curves), the LDGM part is expected to correct all errors, and a low BER is assumed in this case (refer to Section 3.2.2). An example of an open channel is $D_2$ in Fig. 3-2 for both $d^{(v)}_{\mathrm{LDGM}} = 1$ and 2.
Figure 3-2: EXIT chart and predicted BER at crossing points at SNR = 24 dB for Scenario 1 with a uniform source ($p_0 = 0.5$) and $d^{(v)}_{\mathrm{LDGM}} = 1, 2$.
Recall that the VN curve of the EXIT chart depends on $M$, $I$, $d^{(v)}_{\mathrm{LDGM}}$ and $d^{(c)}_{\mathrm{RCM}}$, whereas the CND curve also depends on the actual values of $D$ and the SNR. Therefore, only codes that differ in their weights and/or SNR can be compared in a single diagram. An example of how the EXIT chart depends on $d^{(v)}_{\mathrm{LDGM}}$ is illustrated in Fig. 3-2.
As can be observed from Fig. 3-3, increasing the SNR moves the CND curve upwards, which postpones the crossing point between the curves and therefore predicts a lower BER.
The parameter $\kappa$ of our Gaussian assumption (3.8) was obtained using the steps referenced in Section 3.1.1 (see Appendix A) for each scenario and value of $d^{(v)}_{\mathrm{LDGM}}$. For the first scenario, the value obtained for $\kappa$ was approximately 30 and 110 for $d^{(v)}_{\mathrm{LDGM}} = 1$ and $d^{(v)}_{\mathrm{LDGM}} = 2$, respectively. For the second scenario, the value of $\kappa$ obtained is 2.3.

Code                                      Predicted BER
$D_1$, $d^{(v)}_{\mathrm{LDGM}} = 1$      9e-2
$D_1$, $d^{(v)}_{\mathrm{LDGM}} = 2$      8.7e-2
$D_2$, $d^{(v)}_{\mathrm{LDGM}} = 1$      Open channel
$D_2$, $d^{(v)}_{\mathrm{LDGM}} = 2$      Open channel
$D_3$, $d^{(v)}_{\mathrm{LDGM}} = 1$      7e-2
$D_3$, $d^{(v)}_{\mathrm{LDGM}} = 2$      6.78e-2
Table 3.1: Predicted BER values of Fig. 3-2

Figure 3-3: EXIT chart and predicted BER at crossing points for Scenario 2 with a nonuniform source ($p_0 = 0.8$) and $d^{(v)}_{\mathrm{LDGM}} = 1$.
It should be mentioned that the EXIT curves always end at (H(p0), H(p0)), where
H(p0) is the entropy of the source. This can be seen for the nonuniform source in
Fig. 3-3 (H = 0.72).
Code            Predicted BER
$D_4$ @ 10 dB   1.6e-2
$D_4$ @ 12 dB   2.4e-4
$D_5$ @ 10 dB   2.3e-2
$D_5$ @ 12 dB   3e-3
$D_6$ @ 10 dB   1.1e-2
$D_6$ @ 12 dB   4.2e-4
Table 3.2: Predicted BER values of Fig. 3-3
3.2.2 Performance of the BER estimation
Next, we analyze the performance of our estimation for the parallel RCM-LDGM
codes. To that end, we compare the BER obtained by Monte-Carlo simulation with
those obtained from the EXIT chart analysis.
Figure 3-4: Simulated (solid curves) and predicted (dashed-squared curves) BER versus SNR for the RCM-LDGM codes of Scenario 2 with $d^{(v)}_{\mathrm{LDGM}} = 1$ for uniform and nonuniform sources ($p_0 = 0.5$, 0.8 and 0.95).
Note that when the SNR is such that there is no crossing point between the EXIT curves, the channel is open, and in these cases the actual BER will be very small but difficult to predict. This is shown as vertical lines in the figures, drawn in the colour of the code whose channel opens at that SNR. For example, for the code with weight set $D_2 = \{2, 3, 4, 8\}$ (red) shown in Fig. 3-5, the channel opens at an SNR of 24 dB for a uniform source.

Figure 3-5: Simulated (solid curves) and predicted (dashed-squared curves) BER versus SNR for the RCM-LDGM codes of Scenario 1 with $d^{(v)}_{\mathrm{LDGM}} = 1$ for uniform and nonuniform sources ($p_0 = 0.5$, 0.8 and 0.95). Vertical lines correspond to channel openings for the code with the same colour.
As can be seen in Figs. 3-4, 3-5 and 3-6, for those SNRs at which the EXIT curves intersect (i.e., in the high BER region), our BER prediction is accurate in both scenarios. On the other hand, for those SNRs where there is an open channel (i.e., in the low BER region), the BER is set to zero since no other prediction is possible. Observe that, in these cases, our prediction of the waterfall region is always more optimistic than the actual simulated BER. A possible explanation is that, for a given rate, the EXIT chart only depends on the ratio between RCM and LDGM symbols and not on the particular block length, which is implicitly assumed to be infinite. The simulations, on the other hand, are performed with finite block lengths of 37000 bits. In addition, observe that the codes with $d^{(v)}_{\mathrm{LDGM}} = 2$ present a wider gap between our prediction and the actual BER curve. A possible reason is that, when $d^{(v)}_{\mathrm{LDGM}} = 2$, cycles are more likely to appear in the factor graph than for codes with $d^{(v)}_{\mathrm{LDGM}} = 1$.

Figure 3-6: Simulated (solid curves) and predicted (dashed-squared curves) BER versus SNR for the RCM-LDGM codes of Scenario 1 with $d^{(v)}_{\mathrm{LDGM}} = 2$ for uniform and nonuniform sources ($p_0 = 0.5$, 0.8 and 0.95). Vertical lines correspond to channel openings for the code with the same colour.
In either case, the proposed technique makes it possible to find good codes without extensive simulations. To obtain a BER value around $5 \cdot 10^{-5}$ by Monte Carlo techniques, at least $2 \cdot 10^{6}$ bits (i.e., 54 blocks of 37000 bits) have to be simulated in order to count 100 errors. The simulation of each block requires many iterations (between 50 and 75) of the steps in Section 2.4. In contrast, our proposed technique requires only one iteration per $\sigma_A^2$ (where $\sigma_A^2$ takes a few equally spaced values in some appropriate interval), resulting in a speed-up of around two orders of magnitude.
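The simulation cost quoted above follows from simple arithmetic, reproduced in the short Python sketch below. The variable names are illustrative; the only inputs are the figures stated in the text (target BER, number of errors to count, block length and the range of SPA iterations per block).

```python
target_ber = 5e-5        # BER level to be measured by Monte Carlo
min_errors = 100         # bit errors to count for a reliable estimate
block_len = 37000        # source bits per block

bits_needed = min_errors / target_ber       # bits that must be simulated
blocks_needed = bits_needed / block_len     # corresponding number of blocks
iters_low, iters_high = 50, 75              # SPA iterations per block (from the text)

# Total number of SPA iterations over all simulated blocks
total_iters = (blocks_needed * iters_low, blocks_needed * iters_high)

print(f"bits: {bits_needed:.0f}, blocks: {blocks_needed:.1f}")
print(f"SPA iterations in total: {total_iters[0]:.0f} to {total_iters[1]:.0f}")
```

The EXIT-based procedure replaces these block-level iterations by one SPA iteration per value of $\sigma_A^2$, which is where the saving comes from.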
Chapter 4
Strategies for Sources with Memory
4.1 Introduction
We consider the transmission of a block of $K$ binary symbols, $U = \{U_k\}_{k=1}^{K}$, generated by a time-correlated binary source $\mathcal{S}$, over an AWGN channel with noise $Z$ to a destination. We assume that the source $\mathcal{S}$ follows a Markov Chain (MC) with $\gamma$ states or a Hidden Markov Model, and we denote its entropy rate by $H(\mathcal{S})$. For the transmission of $U = \{U_k\}_{k=1}^{K}$, we use a Joint Source Channel Code (JSCC) composed of an encoder $E$ and a decoder $D$ with rate $R = K/N$. This is represented in Fig. 4-1.
Figure 4-1: Conventional joint source-channel coding for a P2P communication system of rate $R = K/N$ (binary symbols per channel symbol).
We would like to compare the performance of two different communications systems, namely, the standard point-to-point (P2P) communications system shown in Fig. 4-1 and that of Fig. 4-2, where the Burrows Wheeler Transform (BWT) is applied to $U = \{U_k\}_{k=1}^{K}$ before the encoding step.
Figure 4-2: BWT communication system.
First of all, we begin by defining both types of sources.
4.1.1 Source Characterization
We statistically characterize the sequence $u = \{u_k\}_{k=1}^{K}$ by the probability mass distribution (p.m.d.), $P(u_1, \ldots, u_K)$, of the source. Without loss of generality, $P(u_1, \ldots, u_K)$ can be approximated by a time invariant Markov Chain (MC) or a Hidden Markov Model (HMM) with a sufficient number of states.
Markov Chain
A discrete stochastic process $\{S_k\}_{k=1}^{\infty}$ is a Markov Chain (MC) with $S_k$ as the state at instant $k$ if
$$P_{S_k|S_{k-1} \ldots S_1}(s_k | s_{k-1}, \ldots, s_1) = P_{S_k|S_{k-1}}(s_k | s_{k-1}), \tag{4.1}$$
for $k \in \{1, \ldots, \infty\}$. Furthermore, the MC is time invariant if (4.1) does not depend on $k$. A time invariant MC is characterized by the set of parameters $\lambda_{MC} \triangleq \{S_\lambda, A, \Gamma\}$, where
• $S_\lambda$ is the number of states in the chain. That is, $S_k \in \{1, 2, \ldots, S_\lambda\}$, $\forall k$.
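• $A$ is an $S_\lambda \times S_\lambda$ state transition probability matrix with entries $a_{i,j} \triangleq P_{S_k|S_{k-1}}(j|i)$, for $i, j \in \{1, \ldots, S_\lambda\}$, satisfying $\sum_{j=1}^{S_\lambda} a_{i,j} = 1$, $\forall i \in \{1, \ldots, S_\lambda\}$.
• $\Gamma$ is an $S_\lambda \times 1$ initial state probability vector with components $P_{S_1}(i)$, $i \in \{1, \ldots, S_\lambda\}$. For a stationary MC, $\Gamma$ equals the stationary distribution of the chain, i.e., $\Gamma = A^{T}\Gamma$ for the row-stochastic $A$ defined above.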
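As a small worked example of these definitions, the Python sketch below computes the stationary distribution of a time invariant MC and its entropy rate, using the standard formula $H(\mathcal{S}) = \sum_i \pi_i H(a_{i,\cdot})$ for a Markov chain whose state determines the emitted symbol. It assumes the row-stochastic convention $a_{i,j} = P(j|i)$, so the stationary vector solves $\pi = A^{T}\pi$; the function names and the least-squares solver are illustrative choices.

```python
import numpy as np

def stationary_distribution(A):
    """Solve pi = A^T pi with sum(pi) = 1 for a row-stochastic matrix A."""
    n = A.shape[0]
    # Stack the balance equations and the normalisation constraint
    M = np.vstack([A.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi

def entropy_rate(A):
    """Entropy rate (bits per symbol) of a stationary Markov chain."""
    pi = stationary_distribution(A)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Row entropies, with the convention 0 * log 0 = 0
        row_h = -np.sum(np.where(A > 0, A * np.log2(A), 0.0), axis=1)
    return float(pi @ row_h)

# Two-state chain with a_{1,1} = 0.6 and a_{2,2} = 0.7
A = np.array([[0.6, 0.4],
              [0.3, 0.7]])
print(stationary_distribution(A), entropy_rate(A))
```

For the transition probabilities $a_{1,1} = 0.6$, $a_{2,2} = 0.7$ the computed entropy rate is close to 0.92 bits per symbol.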
Figure 4-3: Example of a Markov Chain of two states.
Hidden Markov Model
A very useful generalization of the MC model is the Hidden Markov Model [11], [12]. In this case the mapping between the states and the source symbols is probabilistic rather than deterministic. Therefore, an HMM is defined by the set of parameters $\lambda_{HMM} \triangleq \{S_\lambda, A, B, \Gamma\}$, where
• $S_\lambda$, $A$ and $\Gamma$ are defined as for the MC case.
• $B$ is an $S_\lambda \times |\mathcal{U}|$ output distribution probability matrix with entries $b_{i,j} \triangleq P_{U_k|S_k}(j|i)$, for $j \in \mathcal{U}$ and $i \in \{1, \ldots, S_\lambda\}$. Notice that $\mathcal{U}$ is the alphabet of the source.
Figure 4-4: Example of a Hidden Markov Model with 3 hidden states and two visible output symbols, "1" and "0".
Since we are going to work with bits, we will consider Markov Chains with two states or HMMs with any number of hidden states but with $\mathcal{U} = \{0, 1\}$ ($|\mathcal{U}| = 2$).
4.2 Proposed Strategies
In this section we explain two novel strategies when the joint RCM-LDGM code is
used to transmit binary symbols of a source with memory.
4.2.1 Strategy 1
Figure 4-5: Factor graph of the parallel RCM-LDGM code with the source's factor graph attached.

In Strategy 1, we encode the correlated source symbols using the encoder block introduced in Section 2.2, but we modify the decoder block introduced in Section 2.4 to exploit the correlation of the source. More precisely, we propose attaching the factor graph that models the source to the factor graph that represents the code and the AWGN channel, and running the sum-product algorithm on the whole factor graph. This is represented in Fig. 4-5. An iteration of the SPA on this factor graph consists of the sequential execution of the following steps:
• STEP 1: Message passing from variable nodes $\{U_k\}_{k=1}^{K}$ to function nodes of RCM and LDGM, $\{T_i\}_{i=K+1}^{N}$. This step is exactly the same as STEP 1 of Section 2.4.
• STEP 2: Message passing from function nodes of RCM and LDGM, $\{T_i\}_{i=K+1}^{N}$, to variable nodes $\{U_k\}_{k=1}^{K}$. This step is exactly the same as STEP 2 of Section 2.4.
• STEP 3: Message passing from variable nodes $\{U_k\}_{k=1}^{K}$ to function nodes of HMM, $\{T_k\}_{k=1}^{K}$.
$$q_{k,k} = \sum_{j \in n(U_k)} r_{j,k}$$
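• STEP 4: Message passing from function nodes of HMM, $\{T_k\}_{k=0}^{K}$, to variable nodes $\{U_k\}_{k=1}^{K}$. We first need to compute the forward and backward recursions $(\alpha(s_k), \beta(s_{k-1}))$. That is,
$$\alpha(s_k) = \sum_{\{\sim s_k\}} T_k(s_{k-1}, s_k, u_k)\, q_{k,k}\, \alpha(s_{k-1}) \tag{4.2}$$
$$\beta(s_{k-1}) = \sum_{\{\sim s_{k-1}\}} T_k(s_{k-1}, s_k, u_k)\, q_{k,k}\, \beta(s_k), \tag{4.3}$$
where $T_k(s_{k-1}, s_k, u_k)$, for $k = 0, 1, \ldots, K$, is the $k$th factor node of the HMM, which is given by
$$T_k(s_{k-1}, s_k, u_k) = \begin{cases} f(s_k) & \text{for } k = 0 \\ f(u_k | s_{k-1}) & \text{for } k = K \\ f(s_k | s_{k-1})\, f(u_k | s_{k-1}) & \text{else} \end{cases}$$
Taking into account (4.2) and (4.3), we have that
$$P(u_k = 0) = \sum_{\{\sim u_k\}} T_k(s_{k-1}, s_k, u_k = 0)\, \alpha(s_{k-1})\, \beta(s_k)$$
$$P(u_k = 1) = \sum_{\{\sim u_k\}} T_k(s_{k-1}, s_k, u_k = 1)\, \alpha(s_{k-1})\, \beta(s_k).$$
Hence,
$$r_{k,k} = \log\left(\frac{P(u_k = 1)}{P(u_k = 0)}\right)$$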
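A minimal Python sketch of STEP 3 and STEP 4 is given below. It is an illustration under stated assumptions rather than the decoder used in the thesis: the incoming LLRs $q_{k,k}$ are first converted into probability messages for $u_k = 0$ and $u_k = 1$, the symbol $u_k$ is emitted from state $s_{k-1}$ as in the factor $T_k$, the recursions are normalised at each step for numerical stability, and the function name `hmm_extrinsic_llrs` and the array layout are hypothetical.

```python
import numpy as np

def hmm_extrinsic_llrs(A, B, gamma, q):
    """Forward-backward pass over the source chain.

    A[i, j]  : P(s_k = j | s_{k-1} = i)
    B[i, u]  : P(u_k = u | s_{k-1} = i), u in {0, 1}
    gamma[i] : P(s_0 = i)
    q[k]     : incoming LLR log(P(u=1)/P(u=0)) for symbol k (0-based)
    Returns the extrinsic LLRs r[k] sent back to the variable nodes.
    """
    q = np.asarray(q, dtype=float)
    K, S = len(q), A.shape[0]
    # Incoming messages as probabilities of u = 0 and u = 1
    m = np.stack([1.0 / (1.0 + np.exp(q)), 1.0 / (1.0 + np.exp(-q))], axis=1)
    # w[k, s] = sum_u B[s, u] * m[k, u]: symbol k seen from state s
    w = m @ B.T

    alpha = np.zeros((K + 1, S))
    alpha[0] = gamma
    for k in range(K):                      # forward recursion, cf. (4.2)
        a = A.T @ (w[k] * alpha[k])
        alpha[k + 1] = a / a.sum()

    beta = np.ones((K + 1, S))
    for k in range(K - 1, -1, -1):          # backward recursion, cf. (4.3)
        b = w[k] * (A @ beta[k + 1])
        beta[k] = b / b.sum()

    r = np.empty(K)
    for k in range(K):
        # p[u] is proportional to sum over (s_{k-1}, s_k) of T_k * alpha * beta
        p = (alpha[k] * (A @ beta[k + 1])) @ B
        r[k] = np.log(p[1] / p[0])
    return r
```

Because the message for symbol $k$ excludes its own incoming LLR, the output is extrinsic, as required by the sum-product algorithm. The final transition out of $s_{K-1}$ sums to one, so including it is equivalent to the $k = K$ case of $T_k$.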
4.2.2 Strategy 2
Strategy 2 proposes applying the Burrows Wheeler Transform (BWT), described below and in Appendix B, to the correlated symbols before the encoding step, and the inverse BWT after the decoding step. This is shown in Fig. 4-2.
Before explaining Strategy 2, we first introduce the key concepts of the BWT.
Burrows Wheeler Transform (BWT)
Let $T = \{T_k\}_{k=1}^{K}$, $T_k \in \{0, 1\}$, denote the output sequence of the BWT of $\{U_k\}_{k=1}^{K}$. The authors in [14] show that the probability distribution of $T = \{T_k\}_{k=1}^{K}$, $Q_T(t)$, is approximately piecewise memoryless, i.e.,
$$Q_T(t) = \prod_{i=1}^{S_\lambda} \prod_{k=w_{i-1}}^{w_i - 1} P_i(t_k),$$
where $w_i \in \mathbb{N}$, $w_0 = 1$ and $w_{S_\lambda} = K + 1$. Notice that, for a source characterized by a Markov chain, $S_\lambda$ defines the number of piecewise memoryless intervals $L_i$, defined as $L_i = [w_{i-1}, w_i)$. Each of these intervals has a length of $K_i = |L_i|$ bits, where $K = \sum_{i=1}^{S_\lambda} K_i$, and is characterized by a first order probability distribution
$$P_i(t_k) \triangleq \begin{cases} p_0^{i} & \text{if } t_k = 0 \\ p_1^{i} & \text{if } t_k = 1 \end{cases}$$
where $k \in L_i$. Furthermore, the ratios $K_i/K$ do not depend on $K$.
Therefore, we can consider that the output sequence of the BWT has been generated by $S_\lambda$ independent DMS binary sources $\mathcal{S}_1, \mathcal{S}_2, \ldots, \mathcal{S}_{S_\lambda}$, each with an entropy of
$$H_i = -p_0^{i} \log p_0^{i} - (1 - p_0^{i}) \log(1 - p_0^{i}), \quad i = 1, 2, \ldots, S_\lambda.$$
Observe that, by the independence of the sources and their symbols, the entropy rate of the original source can be expressed as
$$H(\mathcal{S}) = \sum_{i=1}^{S_\lambda} \frac{K_i}{K} H_i. \tag{4.4}$$
That is, each source $\mathcal{S}_i$ generates the sequence $T_i = \{T_k\}_{k=w_{i-1}}^{w_i - 1}$ of $K_i$ binary symbols, such that $T = \{T_1, T_2, \ldots, T_{S_\lambda}\}$.
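Equation (4.4) is a length-weighted average of binary entropies and can be checked numerically. The Python sketch below evaluates it for the two segments quoted later in Example 3 ($K_1 = 16444$ with $p_0 = 0.4$ and $K_2 = 20556$ with $p_0 = 0.7$); the function names are illustrative and logarithms are taken in base 2.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary memoryless source with P(0) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def entropy_rate_from_segments(lengths, p0s):
    """Equation (4.4): length-weighted average of the segment entropies."""
    total = sum(lengths)
    return sum(k / total * binary_entropy(p) for k, p in zip(lengths, p0s))

h = entropy_rate_from_segments([16444, 20556], [0.4, 0.7])
print(f"H(S) from (4.4): {h:.3f} bits per symbol")
```

The result agrees with the entropy rate $H(\mathcal{S}) = 0.92$ stated for that source.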
Source Splitting Scheme
Based on this source splitting scheme and on time division multiplexing, Strategy 2 converts the conventional JSCC-P2P-AWGN($N_0/2$) scheme of rate $R = C(\mathrm{SNR})/H(\mathcal{S})$ for the transmission of the binary symbols generated by a source with memory $\mathcal{S}$ (with entropy rate $H(\mathcal{S})$) into $S_\lambda$ parallel JSCC-P2P-AWGN($N_0/2$) schemes of rates $R_i = C(\mathrm{SNR}_i)/H_i$, each driven by a memoryless source $\mathcal{S}_i$ (with entropy $H_i$) and some $\mathrm{SNR}_i$. This is shown in Fig. 4-6. The following theorem finds the minimum average SNR over the $S_\lambda$ parallel channels under the restriction that the overall rate equals $R$.
Figure 4-6: Equivalent BWT communication system of Fig. 4-2. Note that $K = \sum_{i=1}^{\gamma} K_i$.
Theorem 1. For any given rate $R = K/N$, the optimal signal-to-noise ratio allocated to each parallel JSCC represented in Fig. 4-6 equals the signal-to-noise ratio of the standard P2P communications system shown in Fig. 4-1. In that case, the partial rate of each parallel JSCC code of Fig. 4-6 equals $R_i = R\,\frac{H(\mathcal{S})}{H_i}$.
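Proof. Given a set of signal-to-noise ratios $\{\mathrm{SNR}_i\}_{i=1}^{\gamma}$, the rates for the JSCC encoders in Fig. 4-6 are given by Shannon's separation theorem as
$$R_i = \frac{K_i}{N_i} = \frac{C(\mathrm{SNR}_i)}{H_i}, \quad i = 1, \ldots, \gamma,$$
where in our case $K = \sum_{i=1}^{\gamma} K_i$.
We seek to minimize the overall signal-to-noise ratio of the $\gamma$ parallel AWGN channels, i.e.,
$$\mathrm{SNR} = \sum_{i=1}^{\gamma} \frac{N_i}{\sum_{j=1}^{\gamma} N_j}\, \mathrm{SNR}_i = \sum_{i=1}^{\gamma} \frac{K_i}{N} \frac{H_i}{C(\mathrm{SNR}_i)}\, \mathrm{SNR}_i$$
under the constraint that the total rate equals the rate of the conventional system, that is,
$$R = \frac{K}{N} = \frac{\sum_{i=1}^{\gamma} K_i}{\sum_{j=1}^{\gamma} N_j}.$$
Therefore, the constraint reduces to
$$N = \sum_{j=1}^{\gamma} N_j = \sum_{j=1}^{\gamma} \frac{H_j K_j}{C(\mathrm{SNR}_j)} \tag{4.5}$$
Using the method of Lagrange multipliers, we define $F$ as
$$F = \sum_i \frac{K_i H_i}{N} \frac{\mathrm{SNR}_i}{C(\mathrm{SNR}_i)} + \lambda \left( \sum_i \frac{K_i H_i}{C(\mathrm{SNR}_i)} - N \right),$$
and we search for an extremum of $F$, i.e.,
$$\frac{\partial F}{\partial \mathrm{SNR}_i} = 0.$$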
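Since, with $C(\mathrm{SNR}) = \frac{1}{2}\log_2(1 + \mathrm{SNR})$,
$$\frac{\partial F}{\partial \mathrm{SNR}_i} = K_i H_i \left( \frac{\frac{C(\mathrm{SNR}_i)}{N} - \frac{\lambda + \mathrm{SNR}_i/N}{2 \ln(2)\,(1 + \mathrm{SNR}_i)}}{C(\mathrm{SNR}_i)^2} \right) = 0,$$
the set $\{\mathrm{SNR}_i^*\}_{i=1}^{\gamma}$ that minimizes $F$ is obtained by solving the equation
$$\frac{C(\mathrm{SNR}_i^*)}{N} = \frac{\lambda + \mathrm{SNR}_i^*/N}{2 \ln(2)\,(1 + \mathrm{SNR}_i^*)} \tag{4.6}$$
Since (4.6) is the same equation for every $i$, we conclude that the optimal $\mathrm{SNR}_i^*$ are all equal to some $\mathrm{SNR}^*$. Using the constraint that the total rate equals the rate of the conventional system, expressed in (4.5), together with (4.4), we arrive at
$$N = \sum_i N_i = \sum_i \frac{K_i H_i}{C(\mathrm{SNR}^*)} = \frac{K H(\mathcal{S})}{C(\mathrm{SNR}^*)}.$$
Therefore,
$$\frac{K}{N} = \frac{C(\mathrm{SNR}^*)}{H(\mathcal{S})}.$$
Since $\frac{K}{N} = \frac{C(\mathrm{SNR})}{H(\mathcal{S})}$, we conclude that $\mathrm{SNR}_i^* = \mathrm{SNR}^* = \mathrm{SNR}$, where $\mathrm{SNR} = 2^{2\frac{K}{N}H(\mathcal{S})} - 1$ is the signal-to-noise ratio used by the conventional system of Fig. 4-1 to achieve the given rate $\frac{K}{N}$.
Finally, the optimal coding rate for each parallel JSCC in Fig. 4-6 is given by
$$R_i = \frac{K_i}{N_i} = \frac{C(\mathrm{SNR}^*)}{H_i} = \frac{R\, H(\mathcal{S})}{H_i}.$$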
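Theorem 1 can be checked numerically for two parallel branches. The Python sketch below computes the common $\mathrm{SNR}^*$ and the block lengths $N_i$, and then perturbs the SNR of the first branch while keeping the constraint (4.5) satisfied. It assumes $C(\mathrm{SNR}) = \frac{1}{2}\log_2(1+\mathrm{SNR})$ and that the spectral efficiency is $\rho = 2K/N$ with $N = M + I$ channel symbols, which is inferred from the scenario figures rather than stated explicitly; the segment lengths and entropies are the rounded values of Example 3, and all function names are illustrative.

```python
import math

def capacity(snr):
    """AWGN capacity per real dimension, C = 0.5 * log2(1 + SNR)."""
    return 0.5 * math.log2(1.0 + snr)

def snr_for_capacity(c):
    """Inverse of capacity(): the SNR needed to reach capacity c."""
    return 2.0 ** (2.0 * c) - 1.0

def optimal_allocation(lengths, entropies, n_total):
    """Common SNR* and per-branch block lengths N_i given by Theorem 1."""
    c_star = sum(k * h for k, h in zip(lengths, entropies)) / n_total
    n_i = [k * h / c_star for k, h in zip(lengths, entropies)]
    return snr_for_capacity(c_star), n_i

def average_snr_two_branches(lengths, entropies, n_total, snr_1):
    """Average SNR when branch 1 uses snr_1 and branch 2 absorbs (4.5)."""
    n_1 = lengths[0] * entropies[0] / capacity(snr_1)
    n_2 = n_total - n_1
    snr_2 = snr_for_capacity(lengths[1] * entropies[1] / n_2)
    return (n_1 * snr_1 + n_2 * snr_2) / n_total

lengths, entropies, n_total = [16444, 20556], [0.97, 0.88], 10000
snr_star, n_i = optimal_allocation(lengths, entropies, n_total)
rho_i = [2 * k / n for k, n in zip(lengths, n_i)]   # bits per two dimensions

print(f"SNR* = {10 * math.log10(snr_star):.2f} dB")
print("N_i =", [round(n) for n in n_i], "rho_i =", [round(r, 2) for r in rho_i])
```

Any allocation with unequal SNRs that still satisfies (4.5) yields a larger average SNR than `snr_star`, which is the content of the theorem. The resulting block lengths and spectral efficiencies can be compared with the design values reported for Strategy 2 in Section 4.3.2.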
4.3 Results
In this section we evaluate the performance of Strategy 1 and Strategy 2. We fix $\rho = 7.4$ bits per two dimensions and an input block length of 37000 bits.
4.3.1 BER vs SNR of Strategy 1
We begin by showing BER vs SNR results, obtained by Monte Carlo simulations, for two Hidden Markov Models, one modeling a source with higher time correlation than the other.
Example 1. Consider a source with $H(\mathcal{S}) = 0.53$ that is characterized by a Hidden Markov Model with the following parameters:
• $S_\lambda = 2$
• Transition probabilities: $a_{1,1} = 0.9$, $a_{2,2} = 0.5$.
• Output probabilities: $b_{1,1} = 0.95$, $b_{2,2} = 0.5$.
After performing numerous simulations with different design parameters of the joint RCM-LDGM codes, the weight set $D = \{2, 3, 4, 8\}$ with $M = 9800$ RCM symbols, $I = 200$ LDGM symbols and $d^{(v)}_{\mathrm{LDGM}} = 1$ turned out to be the best combination. The result is plotted in blue in Fig. 4-7.
Example 2. Consider another source with $H(\mathcal{S}) = 0.67$ that is characterized by a Hidden Markov Model with the following parameters:
• $S_\lambda = 2$
• Transition probabilities: $a_{1,1} = 0.9$, $a_{2,2} = 0.6$.
• Output probabilities: $b_{1,1} = 0.95$, $b_{2,2} = 0.8$.
Again, after performing numerous simulations, the weight set $D = \{2, 3, 4, 8\}$ with $M = 9800$ RCM symbols, $I = 200$ LDGM symbols and $d^{(v)}_{\mathrm{LDGM}} = 1$ was selected. The result is plotted in red in Fig. 4-7.
As can be observed in Fig. 4-7, using the information of the source in the decoder improves the BER vs SNR behavior; however, the distance to the Shannon limit increases. The vertical lines represent the Shannon limits when transmitting sources with $H(\mathcal{S}) = 0.53$ (blue), $H(\mathcal{S}) = 0.67$ (red) and $H(0.5) = 1$ (dark). The dark curve represents the BER vs SNR behaviour when no source information is used in the decoder, i.e., when the source is assumed to be uniformly distributed.
Figure 4-7: BER vs SNR for Strategy 1 with the sources of the examples with $H(\mathcal{S}) = 0.53$ (HMM1) and $H(\mathcal{S}) = 0.67$ (HMM2), together with the case in which no HMM information is used.
4.3.2 Strategy 2
It has been shown in Section 4.3.1 that using the source information in the decoder improves the BER vs SNR curve, but that it is inefficient with respect to the Shannon limit. In contrast to Strategy 1, Strategy 2 exploits the information of the source to optimize the encoder. Here we show the performance of Strategy 2 and compare it with that of Strategy 1. In order to test the strategies, two different Markov models are used.
Due to the nature of the BWT (refer to Appendix B for more information), a single error in the transmission chain before performing the inverse BWT leads to an arbitrary number of errors. Therefore, we measure the Packet Error Rate (PER) instead of the BER.
Example 3. Consider a source with $H(\mathcal{S}) = 0.92$ that is characterized by a Markov Chain with the following parameters:
• $S_\lambda = 2$
• Transition probabilities: $a_{1,1} = 0.6$, $a_{2,2} = 0.7$.
For a source block of 37000 bits, the zero probability profile after applying the BWT is depicted in Fig. 4-8. Notice that we can distinguish $S_\lambda = 2$ intervals, one with a priori probability $p_0 = 0.4$ and the other with a priori probability $p_0 = 0.7$. Using (4.4), we can approximate the 37000 source symbols generated by the source with memory, with $H(\mathcal{S}) = 0.92$, by $K_1 = 16444$ source symbols generated by a memoryless source with $H(\mathcal{S}_1) = 0.97$ and $K_2 = 20556$ source symbols generated by another memoryless source with $H(\mathcal{S}_2) = 0.88$.
Figure 4-8: Zero probability profile (versus symbol index $K$) of a source with $H(\mathcal{S}) = 0.92$ characterized by a Markov Chain.
The PER vs SNR results for this source are shown in Fig. 4-9. For Strategy 1, after performing many simulations, the joint RCM-LDGM code with weight set $D_2 = \{2, 3, 4, 8\}$, $M = 9900$ RCM symbols, $I = 100$ LDGM symbols and $d^{(v)}_{\mathrm{LDGM}} = 3$ turned out to be the best configuration. As for Strategy 2, following Theorem 1, two encoders with $\rho_1 = 7.02$ and $\rho_2 = 7.74$ have to be used. Using the EXIT chart analysis method of Chapter 3, we arrive at the following design parameters:
Source block    Weight set      M      I     $d^{(v)}_{\mathrm{LDGM}}$
$K_1 = 16444$   {2, 3, 4, 8}    4596   90    1
$K_2 = 20556$   {2, 3, 4, 8}    5204   110   1
Table 4.1: Design parameters of Strategy 2 used in simulation results for $H(\mathcal{S}) = 0.92$.
Figure 4-9: PER versus SNR obtained by Monte Carlo simulation for a source with $H(\mathcal{S}) = 0.92$ characterized by a Markov Chain using Strategy 1 and Strategy 2, together with the Shannon limit.
Example 4. Consider another source with $H(\mathcal{S}) = 0.57$ that is characterized by a Markov Chain with the following parameters:
• $S_\lambda = 2$
• Transition probabilities: $a_{1,1} = 0.7$, $a_{2,2} = 0.9$.
For a source block of 37000 bits, the zero probability profile after applying the BWT is depicted in Fig. 4-10. Notice that we can distinguish $S_\lambda = 2$ intervals, one with a priori probability $p_0 = 0.3$ and the other with a priori probability $p_0 = 0.9$. Using (4.4), we can approximate the 37000 source symbols generated by the source with memory, with $H(\mathcal{S}) = 0.57$, by $K_1 = 9024$ source symbols generated by a memoryless source with $H(\mathcal{S}_1) = 0.88$ and $K_2 = 27976$ source symbols generated by another memoryless source with $H(\mathcal{S}_2) = 0.47$.
Figure 4-10: Zero probability profile (versus symbol index $K$) of a source with $H(\mathcal{S}) = 0.57$ characterized by a Markov Chain.
The PER vs SNR results for this source are shown in Fig. 4-11. For Strategy 1, after performing many simulations, the joint RCM-LDGM code with weight set $D = \{1, 1, 2, 2\}$, $M = 9900$ RCM symbols, $I = 100$ LDGM symbols and $d^{(v)}_{\mathrm{LDGM}} = 3$ turned out to be the best configuration. As for Strategy 2, following Theorem 1, two encoders with $\rho_1 = 4.79$ and $\rho_2 = 8.97$ have to be used. Using the EXIT chart analysis method of Chapter 3, we arrive at the following design parameters:
Source block    Weight set                  M      I    $d^{(v)}_{\mathrm{LDGM}}$
$K_1 = 9024$    {1, 1, 2, 2}                3736   30   3
$K_2 = 27976$   {1, 1, 1, 1, 2, 2, 2, 2}    6184   50   3
Table 4.2: Design parameters of Strategy 2 used in simulation results for $H(\mathcal{S}) = 0.57$.
Figure 4-11: PER versus SNR obtained by Monte Carlo simulation for a source with $H(\mathcal{S}) = 0.57$ characterized by a Markov Chain using Strategy 1 and Strategy 2, together with the Shannon limit.
As can be observed in Fig. 4-9 and Fig. 4-11, Strategy 2, which uses the source information to optimize the encoding process, outperforms Strategy 1. When the source correlation is low, as in Example 3, both strategies perform similarly at SNR = 24.5 dB. However, when the source correlation is high, as in Example 4, Strategy 2 outperforms Strategy 1.
Chapter 5
Conclusions
The conclusions are twofold:
• An EXIT chart analysis for RCM-LDGM codes has been developed. These codes are very well suited for high-throughput transmission of uniform and non-uniform binary memoryless sources. However, when long block lengths are considered, the search for good design parameters using a brute force approach is time consuming. The proposed EXIT analysis reduces the computational complexity by around two orders of magnitude, while the BER predictions are very close to the results obtained through simulations, especially at relatively high values of the BER.
• Two different strategies for transmitting time-correlated symbols using the RCM-LDGM family of codes have been analyzed. Strategy 1 takes the source correlation into account only in the receiver, by applying the SPA to the factor graph composed of the code's factor graph with the source's factor graph attached. Strategy 2, on the other hand, applies the BWT before the encoding step to transform the source with memory into non-uniformly distributed memoryless symbols and optimizes their encoding. Strategy 2 outperforms Strategy 1, especially when the time correlation is high, since it takes advantage of the source information in the encoder.
Appendix A
Obtaining κ
The constant $\kappa$ is computed by Monte Carlo simulation through the following iterative procedure:
1. Start with an initial value of $\kappa$ in (3.8), say $\kappa_0 = 1$, and choose a $\sigma_A^2$ such that the corresponding value of the mutual information computed from the p.d.f. in (3.7) is in the range (0.5, 0.9). For the given $\sigma_A^2$, generate the extrinsic messages passed from the VN to the LDGM and RCM check nodes according to (3.11) and (3.10), respectively.
2. Run one iteration of the sum-product algorithm to obtain the extrinsic LLR messages passed from each of the LDGM and RCM check nodes to the VN, and obtain their empirical conditional p.d.f.'s.
3. Define $\kappa_1$ as the ratio between the variances of the empirical conditional distributions of RCM and LDGM obtained in step 2.
4. Starting with $\kappa_0 = \kappa_1$, repeat the previous 3 steps until $\kappa_0 \approx \kappa_1$.
5. Set $\kappa = \kappa_1$ in the distribution (3.8).
In all simulated cases, the number of iterations required for convergence was found to be around three.
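The procedure above is a fixed-point iteration on $\kappa$ and can be organised as in the following Python sketch. The callable `simulate_variances` is a hypothetical placeholder for steps 1 and 2 (message generation and one SPA iteration); it is not defined here and is not part of the thesis. The stopping tolerance and the iteration cap are likewise assumptions.

```python
def estimate_kappa(simulate_variances, kappa0=1.0, rel_tol=0.05, max_iter=20):
    """Fixed-point iteration for kappa following the steps of Appendix A.

    simulate_variances(kappa) is a user-supplied (hypothetical) routine that
    generates the VN-to-CN messages for the given kappa, runs one SPA
    iteration and returns the empirical variances (var_rcm, var_ldgm) of the
    CN-to-VN messages.
    """
    kappa = kappa0
    for _ in range(max_iter):
        var_rcm, var_ldgm = simulate_variances(kappa)       # steps 1 and 2
        kappa_new = var_rcm / var_ldgm                      # step 3
        if abs(kappa_new - kappa) <= rel_tol * kappa_new:   # step 4: kappa0 ~ kappa1
            return kappa_new                                # step 5
        kappa = kappa_new
    return kappa
```

Passing the simulator as an argument keeps the iteration logic independent of how the sum-product step is implemented.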
Figure A-1: Example of the iterations to find $\kappa$ for the configuration of Scenario 1 with $d^{(v)}_{\mathrm{LDGM}} = 1$ of Section 3.2. Each panel shows an empirical p.d.f. and a modeled p.d.f. The histograms obtained by Monte Carlo simulation, conditioned on $U = 1$, are plotted in blue and the corresponding modeled conditional p.d.f.'s in red.
Fig. A-1 shows a graphical example of the steps followed to find $\kappa$ for a particular configuration. The empirical conditional p.d.f.'s obtained initially (i.e., when $\kappa = 1$), plotted in blue, are shown in the first row of the image. Note that, since $\kappa = 1$, the approximated conditional p.d.f.'s of (3.7) and (3.8), plotted in red, are the same. As can be observed, none of the LLR messages is appropriately modeled at this point, since $\kappa = 1$ was chosen arbitrarily.
Following step 3 results in $\kappa = 43$, and by repeating the above procedure the corresponding empirical conditional distributions, shown in the second row of the figure, are obtained. Observe that, for this value of $\kappa = 43$, the messages are better modeled by (3.7) and (3.8); however, there is still room for improvement. The second iteration results in $\kappa = 26$. The corresponding empirical conditional distributions are
shown in the third row of the image. If we perform an additional iteration, it will
Appendix B
Burrows Wheeler Transform (BWT)
The Burrows-Wheeler transform is a reversible algorithm that rearranges a character string into runs of similar characters. It can be applied to any character string and is usually used to prepare data for compression. We explain the algorithm through an example with the bit string "110100".
B.1 Direct transformation
First of all, an 'End of File' (?) marker has to be appended to the string, resulting in the new string "110100?" of length 7. The transform is computed by sorting all rotations of the string in lexicographical order and taking the last column.
Input      All Rotations   Sorted Strings   Output
           110100?         00?1101
           ?110100         0100?11
           0?11010         0?11010
110100?    00?1101         100?110          11001?0
           100?110         10100?1
           0100?11         110100?
           10100?1         ?110100
Table B.1: Example of Burrows-Wheeler Transformation.
Finally, the output bit string is "110010". Note that, for the algorithm to be reversible, the position of the EOF (?) needs to be known. This could be done by adding extra bits with the position of the EOF (?) in the sequence. Let $N$ be the length of the sequence; the spectral efficiency of the transmission will then be affected by a factor of $\frac{N}{N + \log_2(N)}$ and, therefore, this overhead will not be taken into account for large sequences.
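The direct transform described above can be sketched in a few lines of Python. This is the naive construction (sort all rotations, keep the last column), given only as an illustration; the function name is arbitrary, and the sketch relies on the '?' marker sorting after '0' and '1', which holds for Python's default string ordering and matches the order used in Table B.1.

```python
def bwt(text, eof="?"):
    """Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + eof                                   # append the EOF marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)     # take the last column

print(bwt("110100"))
```

For the example string "110100" the function returns "11001?0", the output column of Table B.1.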
B.2 Inverse BWT
The first step of the inverse transform is to take the extra bits that encode the position of the EOF (?) character and place the character correctly. Imagine that you have the sorted strings table of the BWT transformation and erase everything but the last column. Given only this information, you can easily reconstruct the first column. The last column tells you all the characters (bits in the example) in the text, so just sort these characters alphabetically to get the first column. Then, the first and last columns (of each row) together give you all pairs of successive characters in the document, where pairs are taken cyclically so that the last and first character form a pair. Sorting the list of pairs gives the first and second columns. Continuing in this manner, you can reconstruct the entire list. Then, the row with the EOF (?) character at the end is the original string.
Start   Sort1   Add2   Sort2   Add3   Sort3   ...   Sort7     Output
1       0       10     00      100    00?           00?1101
1       0       10     01      101    010           0100?11
0       0       00     0?      00?    0?1           0?11010
0       1       01     10      010    100     ...   100?110
1       1       11     10      110    101           10100?1
?       1       ?1     11      ?11    110           110100?   110100
0       ?       0?     ?1      0?1    ?11           ?110100
Table B.2: Example of inverse Burrows-Wheeler Transformation.
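The add-and-sort procedure of Table B.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this work: `bwt_decode` and `EOF` are hypothetical names, and the EOF position is assumed to be supplied as a zero-based index into the last column. The approach rebuilds the whole sorted table, so it is meant for short examples only.

```python
EOF = "?"  # hypothetical end-of-file marker, as in the example above


def bwt_decode(bits: str, eof_pos: int) -> str:
    """Invert the transform given the output bits and the EOF position."""
    # Re-insert the EOF marker to recover the last column of the sorted table.
    last_column = bits[:eof_pos] + EOF + bits[eof_pos:]
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # "Add" step: prepend the last column to each row.
        # "Sort" step: sort the rows lexicographically ('0' < '1' < '?').
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the EOF marker is the original string.
    original = next(row for row in table if row.endswith(EOF))
    return original[:-1]
```

With the values of the example, `bwt_decode("110010", 5)` returns `"110100"`.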
Bibliography
[1] H. Cui, C. Luo, J. Wu, C. W. Chen and F. Wu, ”Compressive Coded Modulation
for Seamless Rate Adaption”, IEEE Trans. on Wireless Communications, pp.
4892-4904, October 2013.
[2] H. Cui, C. Luo, K. Tan, F. Wu and C. W. Chen, ”Seamless Rate Adaption for
Wireless Networking”, in Proceedings 14th ACM MSWiM’11, pp. 437-446, 2011.
[3] W. Zhong and J. Garcia-Frias, ”LDGM Codes for Channel Coding and Joint
Source-Channel Coding of Correlated Sources”, EURASIP Journal on Applied
Signal Processing, pp. 942-953, May 2005.
[4] W. Zhong, H. Chai and J. Garcia-Frias, ”Approaching the Shannon Limit through
Parallel Concatenation of Regular LDGM Codes”, in Proceedings ISIT’05, Sept.
2005.
[5] L. Li and J. Garcia-Frias, ”Hybrid Analog Digital Coding Scheme Based on Parallel
Concatenation of Linear Random Projections and LDGM Codes”, CISS’14, March
2014.
[6] L. Li and J. Garcia-Frias, ”Hybrid Analog-Digital Coding for Nonuniform Memoryless
Sources”, CISS’15, March 2015.
[7] S. Ten Brink, ”Convergence behavior of iteratively decoded parallel concatenated
codes”, IEEE Trans. Commun., vol. 49, no. 10, pp. 1727-1737, Oct. 2001.
[8] S. Ten Brink, G. Kramer and A. Ashikhmin, ”Design of low density parity check
codes for modulation and detection”, IEEE Trans. Commun., vol. 52, no. 4,
pp. 670-678, Apr. 2004.
[9] J. Wu, Z. Teng, H. Cui, C. Luo, X. Huang and H.-H. Chen, ”Arithmetic-BICM for
Seamless Rate Adaption for Wireless Communications Systems”, IEEE Systems
Journal, vol. 10, no. 1, March 2016.
[10] F. R. Kschischang, B. J. Frey and H.-A. Loeliger, ”Factor Graphs and the
Sum-Product Algorithm”, IEEE Trans. on Information Theory, vol. 47, no. 2,
pp. 498-519, February 2001.
[11] L. R. Rabiner, ”A Tutorial on Hidden Markov Models and Selected Applications
in Speech Recognition”, Proceedings of the IEEE, vol. 77, pp. 257-285, February
1989.
[12] Y. Ephraim and N. Merhav, ”Hidden Markov Processes”, IEEE Transactions on
Information Theory, vol. 48, pp. 1518-1569, June 2002.
[13] M. Burrows and D. Wheeler, ”A Block Sorting Lossless Data Compression
Algorithm”, Research Report 124, Digital Systems Center, 1994.
[14] K. Visweswariah, S. Kulkarni and S. Verdu, ”Output Distribution of the
Burrows Wheeler Transform”, in Proceedings of the IEEE International Symposium
on Information Theory, June 2000.
[15] S. Kullback and R. A. Leibler, ”On Information and Sufficiency”, Annals of
Mathematical Statistics, vol. 22, pp. 79-86, 1951.