

    IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 50, NO. 3, MARCH 2004 417

    On the Application of LDPC Codes to Arbitrary Discrete-Memoryless Channels

    Amir Bennatan, Student Member, IEEE, and David Burshtein, Senior Member, IEEE

    Abstract: We discuss three structures of modified low-density parity-check (LDPC) code ensembles designed for transmission over arbitrary discrete memoryless channels. The first structure is based on the well-known binary LDPC codes following constructions proposed by Gallager and McEliece, the second is based on LDPC codes of arbitrary ($q$-ary) alphabets employing modulo-$q$ addition, as presented by Gallager, and the third is based on LDPC codes defined over the field GF($q$). All structures are obtained by applying a quantization mapping on a coset LDPC ensemble. We present tools for the analysis of nonbinary codes and show that all configurations, under maximum-likelihood (ML) decoding, are capable of reliable communication at rates arbitrarily close to the capacity of any discrete memoryless channel. We discuss practical iterative decoding of our structures and present simulation results for the additive white Gaussian noise (AWGN) channel confirming the effectiveness of the codes.

    Index Terms: $q$-ary low-density parity check (LDPC), belief propagation, coset codes, iterative decoding, LDPC codes, turbo codes.

    I. INTRODUCTION

    Low-density parity-check (LDPC) codes and iterative decoding algorithms were proposed by Gallager [12] four decades ago, but were relatively ignored until the introduction of turbo codes by Berrou et al. [2] in 1993. In fact, there is a close resemblance between LDPC and turbo codes. Both are characterized by a low-density parity-check matrix, and both can be decoded iteratively using similar algorithms. Turbo-like codes have generated a revolution in coding theory, by providing codes that have capacity-approaching transmission rates with practical iterative decoding algorithms. Yet turbo-like codes are usually restricted to single-user discrete-memoryless binary-input symmetric-output channels.

    Several methods for the adaptation of turbo-like codes to more general channel models have been suggested. Wachsmann et al. [31] have presented a multilevel coding scheme using turbo codes as components. Robertson and Wörz [25] and Divsalar and Pollara [9] have presented turbo decoding methods that replace binary constituent codes with trellis-coded modulation (TCM) codes. Kavčić, Ma, Mitzenmacher, and Varnica [16], [19], and [29] have suggested a method that involves a concatenation of a set of LDPC codes as outer codes with a matched-information rate (MIR) inner trellis code, which performs the function of shaping, an essential ingredient in approaching capacity [11].

    Manuscript received November 11, 2002; revised November 14, 2003. This work was supported by the Israel Science Foundation under Grant 22/01-1 and by a fellowship from The Yitzhak and Chaya Weinstein Research Institute for Signal Processing at Tel-Aviv University. The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Yokohama, Japan, June/July 2003.

    The authors are with the Department of Electrical Engineering-Systems, Tel-Aviv University, Tel-Aviv, Ramat-Aviv 69978, Israel (e-mail: [email protected]; [email protected]).

    Communicated by R. Koetter, Associate Editor for Coding Theory.

    Digital Object Identifier 10.1109/TIT.2004.824917

    In his book, Gallager [13] described a construction that can be used to achieve capacity under maximum-likelihood (ML) decoding, on an arbitrary discrete-memoryless channel, using a uniformly distributed parity-check matrix (i.e., each element is set to $0$ or $1$ uniformly, and independently of the other elements). In his construction, Gallager used the coset-ensemble, i.e., the ensemble of all codes obtained by adding a fixed vector to each of the codewords of any given code in the original ensemble. To obtain codes over large alphabets, Gallager also proposed mapping groups of bits from a binary code into channel symbols. This mapping, which we have named quantization, can be designed to overcome the shaping gap to capacity. Gallager suggested these ideas in a general context. Coset LDPC codes were also used by Kavčić et al. [15] in the context of intersymbol-interference (ISI) channels. McEliece [20] discussed the application of turbo-like codes to a wider range of channels, and proposed to use Gallager-style mappings. A uniform mapping from tuples of bits to a constellation of signals was used in [28] and [5] in the contexts of multiple-input, multiple-output (MIMO) fading channels, and interference mitigation at the transmitter, respectively.

    In [12], Gallager proposed a generalization of standard binary LDPC codes to arbitrary alphabet codes, along with a practical iterative soft-decoding algorithm. Gallager presented an analysis technique on the performance of the code that applies to a symmetric channel. LDPC codes over an arbitrary alphabet were used by Davey and MacKay [8] in order to improve the performance of standard binary LDPC codes. Hence, both constructions target symmetric channels.

    In this paper, we discuss three structures of modified LDPC code ensembles designed for transmission over arbitrary (not necessarily binary-input or symmetric-input) discrete memoryless channels. The first structure, called binary quantized coset (BQC) LDPC, is based on [13] and [20]. The second structure is a modification of [12] that utilizes cosets and a quantization map from the code symbol space to the channel symbol space. This structure, called modulo quantized coset (MQC) LDPC, assumes that the parity-check matrix elements remain binary, although the codewords are defined over a larger alphabet. The third structure is based on LDPC codes defined over Galois fields GF($q$) and allows parity-check matrices over the entire range of field elements. This last structure is called GF($q$) quantized coset (GQC) LDPC.

    0018-9448/04$20.00 © 2004 IEEE


    Fig. 1. Encoding of BQC-LDPC codes.

    We present tools for the analysis of nonbinary codes and derive upper bounds on the ML decoding error probability of the three code structures. We then show that these configurations are good in the sense that for regular ensembles with sufficiently large connectivity, a typical code in the ensemble enables reliable communication at rates arbitrarily close to channel capacity, on an arbitrary discrete-memoryless channel. Finally, we discuss the practical iterative decoding algorithms of the various code structures and demonstrate their effectiveness.

    Independently of our work, Erez and Miller [10] have recently examined the performance, under ML decoding, of standard $q$-ary LDPC codes over modulo-additive channels, in the context of lattice codes for additive channels. In this case, due to the inherent symmetry of modulo-additive channels, neither cosets nor quantization mappings are required.

    Our work is organized as follows. In Section II, we discuss BQC-LDPC codes; in Section III, we discuss MQC-LDPC codes; and in Section IV, we discuss GQC-LDPC codes. Section V discusses the iterative decoding of these structures. It also presents simulation results and a comparison with existing methods for transmission over the additive white Gaussian noise (AWGN) channel. Section VI concludes the paper.

    II. BINARY QUANTIZED COSET (BQC) CODES

    We begin with a formal definition of coset-ensembles. Our definition is slightly different from the one used by Gallager [13] and is more suitable for our analysis (see [26]).

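    Definition 1: Let $\mathcal{C}$ be an ensemble of equal block-length binary codes. The ensemble coset-$\mathcal{C}$ (denoted $\hat{\mathcal{C}}$) is defined as the output of the following two steps.

    1. Generate the ensemble $\mathcal{C}'$ from $\mathcal{C}$ by including, for each code $C$ in $\mathcal{C}$, codes generated by all possible permutations on the order of bits in the codewords.

    2. Generate ensemble $\hat{\mathcal{C}}$ from $\mathcal{C}'$ by including, for each code $C'$ in $\mathcal{C}'$, codes of the form $\{\mathbf{c} \oplus \mathbf{v} : \mathbf{c} \in C'\}$ ($\oplus$ denotes bitwise modulo-$2$ addition) for all possible binary vectors $\mathbf{v}$ of the same length as the codewords.

    Given a code $\{\mathbf{c} \oplus \mathbf{v} : \mathbf{c} \in C'\}$ in $\hat{\mathcal{C}}$, we refer to $\mathbf{v}$ as the coset vector and $C'$ as the underlying code.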

    Note: The above definition is given in a general context. However, step 1 can be dropped when generating the coset-ensemble of LDPC codes, because LDPC code ensembles already contain all codes obtained from a permutation of the order of bits.
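As a small illustration of step 2 of Definition 1, the sketch below enumerates the codewords of a toy binary code by brute force and adds a coset vector to each of them. The parity-check matrix, the coset vector, and the function names are hypothetical choices made for this example only; they do not come from the paper.

```python
from itertools import product

def codewords(parity_check):
    """Enumerate all binary words x with H x = 0 (mod 2) by brute force."""
    n = len(parity_check[0])
    return [
        x for x in product((0, 1), repeat=n)
        if all(sum(h * b for h, b in zip(row, x)) % 2 == 0
               for row in parity_check)
    ]

def coset_code(code, v):
    """Add the coset vector v (bitwise modulo-2) to every codeword."""
    return [tuple(c ^ b for c, b in zip(word, v)) for word in code]

# Hypothetical toy parity-check matrix (length 4, two checks).
H = [[1, 1, 0, 0],
     [0, 1, 1, 1]]
v = (1, 0, 0, 0)               # hypothetical coset vector
underlying = codewords(H)      # the underlying (linear) code
shifted = coset_code(underlying, v)
```

Because `v` is not itself a codeword here, the shifted code no longer contains the all-zeros word and is therefore not linear; adding `v` a second time recovers the underlying code.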

    We proceed to formally define the concept of quantization.

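    Definition 2: Let $Q(\cdot)$ be a rational probability assignment of the form $Q(a) = N_a / 2^T$, defined over the set $a \in \{0, \ldots, A-1\}$. A quantization $\delta(\cdot)$ associated with $Q(\cdot)$, of length $T$, is a mapping from the set of vectors $\mathbf{u} \in \{0,1\}^T$ to $\{0, \ldots, A-1\}$ such that the number of vectors mapped to each $a \in \{0, \ldots, A-1\}$ is $2^T Q(a) = N_a$.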

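    Note that in Definition 2, the mapping of vectors to channel symbols is, in general, not one-to-one. Hence the name quantization. Such a mapping enables the achievement of nonuniform distributions $Q(\cdot)$, as required to approach the capacity of asymmetric channels. Applying the quantization to a vector of length $NT$ is performed by breaking the vector into $N$ subvectors of $T$ bits each, and applying $\delta(\cdot)$ to each of the subvectors. That is,

    $$\delta(\mathbf{u}) = \langle \delta(\mathbf{u}_1), \ldots, \delta(\mathbf{u}_N) \rangle, \qquad \mathbf{u} = \langle \mathbf{u}_1, \ldots, \mathbf{u}_N \rangle, \quad \mathbf{u}_i \in \{0,1\}^T. \tag{1}$$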

    The quantization of a code is the code obtained from applying the quantization to each of the codewords. Likewise, the quantization of an ensemble is an ensemble obtained by applying the quantization to each of the ensemble's codes. A BQC ensemble is obtained from a binary code-ensemble by applying a quantization to the corresponding coset-ensemble.
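The quantization of Definition 2 and its blockwise application to a longer bit vector can be sketched as follows. This is a minimal example under stated assumptions: the assignment of bit vectors to symbols in ascending binary order, the target distribution, and the function names are illustrative choices, not the construction used in the paper.

```python
from itertools import product

def build_quantization(counts, T):
    """Map each T-bit vector to a channel symbol so that symbol a
    receives counts[a] vectors, i.e. Q(a) = counts[a] / 2**T."""
    assert sum(counts) == 2 ** T, "counts must sum to 2**T"
    symbols = [a for a, n_a in enumerate(counts) for _ in range(n_a)]
    vectors = list(product((0, 1), repeat=T))  # ascending binary order
    return dict(zip(vectors, symbols))

def quantize(delta, bits, T):
    """Apply delta to consecutive T-bit subvectors of `bits`."""
    assert len(bits) % T == 0, "length must be a multiple of T"
    return [delta[tuple(bits[i:i + T])] for i in range(0, len(bits), T)]

# Hypothetical example: Q = (1/4, 1/2, 1/4) over a ternary channel
# input alphabet, realized with T = 2 bits per channel symbol.
delta = build_quantization([1, 2, 1], T=2)
symbols = quantize(delta, [0, 0, 0, 1, 1, 0, 1, 1], T=2)
```

Two of the four 2-bit vectors map to symbol 1, so a uniformly distributed bit stream induces the nonuniform distribution (1/4, 1/2, 1/4) on the channel input.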

    It is useful to model BQC-LDPC encoding as a sequence of operations, as shown in Fig. 1. An incoming message is encoded into a codeword of the underlying LDPC code. The coset vector is then added, and the quantization mapping applied. Finally, the resulting codeword is transmitted over the channel.

    We now introduce the following notation, which is a generalization of similar notation taken from literature covering binary codes. For convenience, we formulate our results for the discrete-output case. The conversion to continuous output is immediate.

    (2)

    where denotes the transition probabilities of a given

    channel. Using the Cauchy-Schwarz inequality, it is easy

    to verify that and for all . This last

    inequality becomes strict for nondegenerate quantizations and channels, as explained in Appendix I-A. We assume these

    requirements of nondegeneracy throughout the paper.

    We now define

    (3)

    The analysis of the ML decoding properties of BQC codes is based on the following theorem, which relates the decoding error to the underlying code's spectrum.

    Theorem 1: Let be the transition probability assign-

    ment of a discrete memoryless channel with an input alphabet

    . Let be a rate below channel capacity (the


    rate being measured in bits per channel use), an arbitrary

    distribution, and a quantization associated with of

    length . Let be an ensemble of linear binary codes of length

    and rate at least , and the ensemble average number

    of words of weight in . Let be the BQC ensemble corre-

    sponding to .

    Let be an arbitrary set of integers. The

    ensemble average error under ML decoding of , , satisfies

    (4)

    where is given by (3), is the random coding exponent

    as defined by Gallager in [13] for the input distribution

    (5)

    and is given by

    (6)

    , and .

    The proof of this theorem is similar to the proof of Theorem 3 and to proofs provided in [26] and [21]. A sketch of the proof is given in Appendix I-B. The proof of Theorem 3 is given in detail in Appendix II-A.

    Theorem 1 is given in a context of general BQC codes. We now proceed to examine BQC-LDPC codes, constructed based on the ensemble of $(c,d)$-regular binary LDPC codes. It is convenient to define a $(c,d)$-regular binary LDPC code of length $N$ by means of a bipartite $(c,d)$-regular Tanner graph [27]. The graph has $N$ variable (left) nodes, corresponding to codeword bits, and $Nc/d$ check (right) nodes. A word $\mathbf{x}$ is a codeword if at each check node, the modulo-$2$ sum of all the bits at its adjacent variable nodes is zero.

    The following method, due to Luby et al. [18], is used to define the ensemble of $(c,d)$-regular binary LDPC codes. Each variable node is assigned $c$ sockets, and each check node is assigned $d$ sockets. The $Nc$ variable sockets are matched (that is, connected using graph edges) to the $Nc$ check sockets by means of a randomly selected permutation of $\{1, 2, \ldots, Nc\}$. The ensemble of $(c,d)$-regular LDPC codes consists of all the codes constructed from all possible graph constellations.

    The mapping from a bipartite graph to the parity-check matrix is performed by setting each matrix element $H_{i,j}$, corresponding to the $i$th check node and the $j$th variable node, to the number of edges connecting the two nodes modulo-$2$ (this definition is designed to account for the rare occurrence of parallel edges between two nodes). The design rate $R$ of a $(c,d)$-regular LDPC code is defined as $R = 1 - c/d$. This value is a lower bound on the true rate of the code, measured in bits per channel use.
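The socket-matching construction and the modulo-2 edge count can be sketched as follows. This is only an illustrative sampler under stated assumptions: the convention that socket $k$ belongs to node $k$ divided by the node degree (integer division), the function names, the fixed seed, and the toy parameters are choices made for this example.

```python
import random

def sample_regular_parity_check(n, c, d, rng):
    """Sample a parity-check matrix from the (c, d)-regular ensemble
    via a random matching of variable sockets to check sockets."""
    assert (n * c) % d == 0, "n * c must be divisible by d"
    num_checks = n * c // d
    # Variable socket k belongs to variable node k // c;
    # check socket k belongs to check node k // d.
    perm = list(range(n * c))
    rng.shuffle(perm)
    H = [[0] * n for _ in range(num_checks)]
    for var_socket, check_socket in enumerate(perm):
        i, j = check_socket // d, var_socket // c
        H[i][j] ^= 1  # edge count modulo 2: parallel edges cancel in pairs
    return H

def is_codeword(H, x):
    """True if every check sums to zero modulo 2."""
    return all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)

rng = random.Random(0)  # fixed seed, for reproducibility only
H = sample_regular_parity_check(n=12, c=3, d=6, rng=rng)
design_rate = 1 - 3 / 6
```

Since each check node owns $d$ sockets, every row of `H` has weight congruent to $d$ modulo 2; with $d$ even, the all-ones word therefore satisfies every check regardless of the sampled permutation.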

    In [21], it is shown there exists such that the min-

    imum distance of a randomly selected code of a -reg-

    ular, -length, binary LDPC ensemble satisfies

    (7)

    The meaning of the above observation is that the performance of the ensemble is cluttered by a small number of codes (at worst). Removing these codes from the ensemble, we obtain an expurgated ensemble with improved properties.

    Formally, given an ensemble of -regular LDPC codes

    of length and an arbitrary number , we define the

    expurgated ensemble as the ensemble obtained by removing

    all codes of minimum distance less than or equal to from

    . If and is large enough, this does not reduce the size

    of the ensemble by a factor greater than . Thus, the expurgated

    ensemble average spectrum satisfies

    (8)

    We refer to the BQC ensemble corresponding to an expurgated $(c,d)$-regular ensemble as an expurgated BQC-LDPC ensemble. We now use the above construction in the following theorem.

    Theorem 2: Let be the transition probability assign-

    ment of a discrete memoryless channel with an input alphabet

    . Let be some rational positive number, and let

    , , , and be defined as in Theorem 1. Let

    be two arbitrary numbers. Then, for and large enough,

    there exists a -regular expurgated BQC-LDPC ensemble

    (containing all but a diminishingly small proportion of the

    -regular BQC-LDPC codes, as discussed above) of length

    and design rate bits per channel use (the rate of each code

    in the ensemble is at least ), satisfying , such

    that the ML decoding error probability over satisfies

    (9)

    The proof of this theorem relies on results from [21] and is provided in Appendix I-C.

    Gallager [13] defined the exponent to be the max-

    imum, for each , of , evaluated over all possible

    input distributions . Let be the distribution that

    attains this maximum for a given . By the nature of the quantization concept, we are restricted to input distributions

    that assign rational probabilities of the form to

    each . Nevertheless, by selecting

    to approximate , we obtain which approaches

    equivalently close. for all and,

    thus, by appropriately selecting (recalling that ),

    we obtain BQC-LDPC codes that are capable of reliable

    transmission (under ML decoding) at any rate below capacity.

    Furthermore, we have that the ensemble decoding error de-

    creases exponentially with . At rates approaching capacity,

    approaches zero and hence (9) is dominated by the

    random-coding error exponent.


    Fig. 2. Upper bounds on the minimum required SNR for successful ML decoding of some BQC-LDPC codes over an AWGN channel. The solid line is the Shannon limit.

    Fig. 2 compares our bound on the threshold of several BQC-

    LDPC code ensembles, as a function of the signal-to-noise ratio

    (SNR), over the AWGN channel. The quantizations used are

    uniformly spaced values ascending from

    to (10)

    The values of are listed in ascending binary order, e.g.,

    (for the case of ). The quantizations were chosen such

    that a mean power constraint of is satisfied. Note that quanti-

    zation corresponds to equal-spaced 32-PAM transmission, ef-

    fectively representing transmission without quantization. Thus,

    the gap between codes and illustrates

    the importance of quantization.

    To obtain the threshold we use Theorem 1, employing

    methods similar to the ones discussed in [21, Sec. V]. For

    a given -regular code and distribution we seek the

    minimal SNR that satisfies the following (for simplicity we assume that is even, and, therefore, ). We first

    determine the maximal value of such that ,

    we then define ,

    and require that be positive (the maximum

    point yielding positive is evaluated using the mutual

    information as described in [13, Sec. 5.6]).

    III. MODULO-$q$ QUANTIZED COSET (MQC) LDPC CODES

    MQC-LDPC codes are based on modulo-$q$ LDPC codes. The definition of $(c,d)$-regular, modulo-$q$ LDPC codes is adapted from its binary equivalent (provided in Section II) and is a slightly modified version of Gallager's definition [12, Ch. 5]. Bipartite $(c,d)$-regular graphs are defined and constructed in the same way as in Section II, although variable nodes are associated with $q$-ary symbols rather than bits. A $q$-ary word $\mathbf{x}$ is a codeword if at each check node, the modulo-$q$ sum of all symbols at its adjacent variable nodes is zero. The mapping from a bipartite graph to the parity-check matrix is performed by setting each matrix element $H_{i,j}$, corresponding to the $i$th check node and the $j$th variable node, to the number of edges connecting the two nodes modulo-$q$ (we thus occasionally obtain matrix elements of a nonbinary alphabet).

    As in the binary case, the design rate $R$ of a $(c,d)$-regular modulo-$q$ LDPC code is defined as $R = 1 - c/d$. Assuming that $q$ is prime, this value is a lower bound on the true rate of the code, measured in $q$-ary symbols per channel use. The prime $q$ assumption does not pose a problem, as the theorems presented in this section generally require $q$ to be prime.

    We proceed by giving the formal definitions of coset-ensembles and quantization for arbitrary-alphabet codes. Note that the definitions used here are slightly different from the ones used with BQC codes.

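    Definition 3: Let $\mathcal{C}$ be an ensemble of equal block length codes over the alphabet $\{0, \ldots, q-1\}$. The ensemble coset-$\mathcal{C}$ (denoted $\hat{\mathcal{C}}$) is generated from $\mathcal{C}$ by adding, for each code $C$ in $\mathcal{C}$, codes of the form $\{\mathbf{c} + \mathbf{v} \bmod q : \mathbf{c} \in C\}$ for all possible vectors $\mathbf{v}$ over the same alphabet and of the same length as the codewords.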

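    Definition 4: Let $Q(\cdot)$ be a rational probability assignment of the form $Q(a) = N_a / q$, defined over the set $a \in \{0, \ldots, A-1\}$. A quantization $\delta(\cdot)$ associated with $Q(\cdot)$ is a mapping from a set $\{0, \ldots, q-1\}$ to $\{0, \ldots, A-1\}$ where the number of elements mapped to each $a \in \{0, \ldots, A-1\}$ is $q\,Q(a) = N_a$.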

    A quantization is applied to a vector by applying the above mapping to each of its elements. As in the binary case, the quantization of a code is the code obtained from applying the quantization to each of the codewords. The quantization of an ensemble is an ensemble obtained by applying the quantization to each of the ensemble's codes. As with BQC codes, an MQC ensemble is obtained from a modulo-$q$ code ensemble by applying a quantization to the corresponding coset-ensemble.

    Analysis of ML decoding of binary-based BQC codes is focused around the weight distribution of codewords. With $q$-ary-based MQC codes, weight is replaced by the concept of type (note that in [7, Sec. 12.1] the type is defined differently, as a normalized value).

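    Definition 5: The type $\boldsymbol{\theta} = \langle \theta_0, \ldots, \theta_{q-1} \rangle$ of a vector $\mathbf{v} \in \{0, \ldots, q-1\}^N$ is a $q$-dimensional vector of integers such that $\theta_i$ is the number of occurrences of the symbol $i$ in $\mathbf{v}$. We denote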

    the set of all possible types by .

    The spectrum of a $q$-ary code is defined in a manner similar to that of binary codes.

    Definition 6: The spectrum of a code $C$ is defined as $\{S_{\boldsymbol{\theta}}\}$, where $S_{\boldsymbol{\theta}}$ is the number of words of type $\boldsymbol{\theta}$ in $C$.
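The notions of type and spectrum can be made concrete with a short sketch. The toy code below (ternary words of length 3 whose symbols sum to zero modulo 3) and the function names are hypothetical and serve only to illustrate Definitions 5 and 6.

```python
from collections import Counter
from itertools import product

def type_of(word, q):
    """Return the type: entry i counts occurrences of symbol i in `word`."""
    counts = Counter(word)
    return tuple(counts.get(symbol, 0) for symbol in range(q))

def spectrum(code, q):
    """Map each type to the number of codewords of that type."""
    return Counter(type_of(word, q) for word in code)

# Hypothetical toy code: ternary words of length 3 summing to 0 mod 3.
q, n = 3, 3
code = [w for w in product(range(q), repeat=n) if sum(w) % q == 0]
S = spectrum(code, q)
```

In this toy code the six permutations of (0, 1, 2) all share the type (1, 1, 1), while the three constant words each have a type of their own.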

    We now introduce the notation (not to

    be confused with the similar definition of in (2)) defined by

    (11)

    where denotes the transition probabilities of a given

    channel, and is a quantization. Using the Cauchy-Schwarz

    inequality, it is easy to verify that and for all

    . As in the case of BQC codes, the last inequality becomes

    strict for nondegenerate quantizations and channels (defined as

    in Appendix I-A, replacing with ).

    Given a type , we define

    The $q$-ary (uniform distribution) random coding ensemble is created by randomly selecting each codeword and each codeword symbol independently and with uniform probability. The ensemble average spectrum (average number of codewords of type $\boldsymbol{\theta}$) of the random coding ensemble is given by

    $$\bar{S}_{\boldsymbol{\theta}} = M \binom{N}{\theta_0, \ldots, \theta_{q-1}} q^{-N}$$

    where $M$ is the number of codewords in each code and $N$ is the codeword length.
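When each of $M$ codewords of length $N$ is drawn uniformly over a $q$-ary alphabet, the expected number of codewords of a given type equals $M$ times the number of words of that type, divided by $q^N$. The sketch below evaluates this expectation exactly; the function names and the toy parameters are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def multinomial(theta):
    """Number of length-N words having type theta, with N = sum(theta)."""
    result = factorial(sum(theta))
    for count in theta:
        result //= factorial(count)
    return result

def random_coding_spectrum(theta, M, q):
    """Expected number of codewords of type theta among M uniform words."""
    N = sum(theta)
    return Fraction(M * multinomial(theta), q ** N)

def all_types(N, q):
    """Yield every q-tuple of nonnegative integers summing to N."""
    if q == 1:
        yield (N,)
        return
    for first in range(N + 1):
        for rest in all_types(N - first, q - 1):
            yield (first,) + rest

# Hypothetical toy parameters: M = 8 codewords, ternary alphabet, N = 4.
total = sum(random_coding_spectrum(t, M=8, q=3) for t in all_types(4, 3))
```

Summing the expectation over all types recovers $M$, since every codeword has exactly one type.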

    The importance of the random-coding spectrum is given by the following theorem.

    Theorem 3: Let be the transition probability assign-

    ment of a discrete memoryless channel with an input alphabet

    . Let be a rate below channel capacity (the rate being measured in $q$-ary symbols per channel use), an

    arbitrary distribution, and a quantization associated with

    over the alphabet . Let be an ensemble of

    linear modulo- codes of length and rate at least , and

    the ensemble average of the number of words of type in . Let

    be the MQC ensemble corresponding to .

    Let be a set of types. Then the ensemble average error

    under ML decoding of , , satisfies

    (12)

    where is defined as in (11), is given by (5), and is given by

    (13)

    , where is the type of the all-zeros word, and

    .

    The proof of this theorem is provided in Appendix II-A.

    The normalized ensemble spectrum of $q$-ary codes is defined in a manner similar to that of binary codes

    (14)

    where denotes a -dimensional vector of rational numbers sat-

    isfying . Throughout the rest of this paper, we

    adopt the convention that the base of the $\log$ function is always $q$.

    The normalized spectrum of the random coding ensemble is

    given by

    where denotes the entropy function, and is the code rate (in $q$-ary symbols per channel use). The normalized spectrum of modulo-$q$ LDPC codes is given by the following theorem.


    Fig. 3. The set $J$ for a parameter value of $0.2$.

    Theorem 4: The asymptotic normalized ensemble spectrum

    of a -regular modulo- LDPC code over the alphabet

    is given by

    (15)

    where , is given by

    , and is given by

    (16)

    The proof of this theorem is provided in Appendix II-C.

    We now show that the modulo-$q$ LDPC normalized spectrum is upper-bounded arbitrarily closely by the random-coding normalized spectrum.

    Theorem 5: Let be a prime number, let be an

    arbitrarily chosen number, and

    (17)

    Let be a given rational positive number and be the

    random-coding normalized spectrum corresponding to the rate

    . Then for any there exists a number such that

    (18)

    for all , and all , satisfying and .

    The proof of this theorem is provided in Appendix II-D.

    Fig. 3 presents the set for a ternary alphabet. The and

    axes represent variables with being implied by the

    relation . A triangle outlines the region of valid

    values for (that is, and ).

    Fig. 4 presents the normalized spectra of a ternary $(3,6)$-regular LDPC code (the upper surface) and of the ternary rate-$1/2$ random-coding ensemble (the lower surface). The normalized spectra are plotted as functions of parameters

    , with .

    Theorem 5 provides uniform convergence only in a subset of the space of valid values of $\boldsymbol{\theta}$. This means that the maximum (see (13)), evaluated over all values of $\boldsymbol{\theta}$, may be large regardless of the values of $c$ and $d$. The solution to this problem follows along lines analogous to the binary case. We begin with the following lemma.

    Lemma 1: Let be a word of weight ( nonzero ele-

    ments). We now bound the probability of being a codeword

    of a randomly selected code of the -regular modulo-

    ensemble .

    1) If then

    (19)

    where is the number of check nodes in the

    parity-check matrix of a -regular code and de-

    notes the largest integer smaller than or equal to .

    2) For all , assuming prime

    (20)

    where , is given by

    (21)

    and is some constant dependent on alone satisfying

    .

    The proof of this lemma is provided in Appendix II-E.

    We now build on this lemma to examine the probability of there being any low-weight words in a randomly selected code for an LDPC ensemble.

    Theorem 6: Let be fixed, , and assume

    is prime. Then there exists , dependent on and alone,

    such that

    (22)

    where is the minimum distance of a randomly selected

    -regular modulo- code of length .1

    The proof of this theorem is provided in Appendix II-F.

    Given an ensemble of -regular LDPC codes of length

    and an arbitrary number , we define the expurgated

    ensemble as the ensemble obtained by removing all codes of

    minimum distance less than or equal to from . If

    (where is given by Theorem 6) and is large enough, this

    does not reduce the size of the ensemble by a factor greater than

    . Thus, the expurgated ensemble average spectrum satisfies

    (23)

    where is the number of nonzero elements in a word of

    type . That is, . We refer to the

    MQC ensemble corresponding to an expurgated $(c,d)$-regular ensemble as an expurgated MQC-LDPC ensemble.

    1) In fact, it can be shown that fixing $R = 1 - c/d$ and letting $c, d \to \infty$, the minimum distance of a randomly selected code is, with high probability, lower-bounded by a value arbitrarily close to the Varshamov-Gilbert bound.


    Fig. 4. Comparison of the normalized spectra of the ternary $(3,6)$-regular LDPC code ensemble and the ternary rate-$1/2$ random-coding ensemble.

    Theorem 7: Let be the transition probability assign-

    ment of a discrete memoryless channel with an input alphabet

    . Assume is prime. Let be some rational

    positive number, and let , , and be defined as in

    Theorem 3. Let be two arbitrary numbers. Then

    for and large enough there exists a -regular expur-

    gated MQC-LDPC ensemble (containing all but a diminish-

    ingly small proportion of the -regular MQC-LDPC codes,

    as shown in Theorem 6) of length and design rate , satis-

    fying , such that the ML decoding error probability

    over satisfies

    (24)

    The proof of this theorem is provided in Appendix II-G.

    Applying the same arguments as with BQC-LDPC codes, we obtain that MQC-LDPC codes can be designed for reliable transmission (under ML decoding) at any rate below capacity. Furthermore, at rates approaching capacity, (24) is dominated by the random-coding error exponent.

    Producing bounds for individual MQC-LDPC code ensembles, similar to those provided in Figs. 2 and 5, is difficult due

    to the numerical complexity of evaluating (13), for large , over

    sets given by (17).

    Note: In our constructions, we have restricted ourselves to prime values of $q$. In Appendix II-H we show that this restriction

    is necessary, at least for values of that are multiples of .

    IV. QUANTIZED COSET LDPC CODES OVER GF($q$) (GQC)

    GQC-LDPC codes are based on LDPC codes defined over GF($q$). We therefore begin by extending the definition of LDPC codes to a finite field GF($q$).$^2$ In our definition of modulo-$q$ LDPC codes in Section III, the enlargement of the code alphabet size did not extend to the parity-check matrix. The parity-check matrix was designed to contain binary digits alone (with rare exceptions involving parallel edges). We shall see in Appendix II-H that for nonprime $q$, this construction results in codes that are bounded away from the random-coding spectrum. The ideas presented in Appendix II-H are easily extended to ensembles over GF($q$) for $q = p^m$, $m > 1$. We therefore define the GF($q$) parity-check matrix differently, employing elements from the entire GF($q$) field.

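    Bipartite $(c,d)$-regular graphs for LDPC codes over GF($q$) are constructed in the same way as described in Sections II and III, with the following addition: At each edge $(i,j)$, connecting check node $i$ and variable node $j$, a random, uniformly distributed label $g_{i,j} \in$ GF($q$)$\setminus\{0\}$ is selected. A word $\mathbf{x}$ is a codeword if at each check node $i$ the following equation holds:

    $$\sum_{j \in \mathcal{N}(i)} g_{i,j}\, x_j = 0$$

    where $\mathcal{N}(i)$ is the set of variable nodes adjacent to $i$.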

    The mapping from the bipartite graph to the parity-check matrix proceeds as follows. Element $H_{i,j}$ in the matrix, corresponding to the $i$th check node and the $j$th variable node, is set to the GF($q$) sum of all labels corresponding to edges connecting the two nodes. As before, the rate of each code is lower-bounded by the design rate $R = 1 - c/d$ $q$-ary symbols per channel use.
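The labeled-edge construction can be sketched for the special case of prime $q$, where GF($q$) arithmetic coincides with integer arithmetic modulo $q$. For $q = p^m$ with $m > 1$ a proper field implementation would be needed, which this sketch does not attempt. The toy graph, the edge labels (assumed nonzero), and the function names are hypothetical.

```python
def parity_check_from_labels(num_checks, num_vars, labeled_edges, q):
    """Build H over GF(q), q prime: entry (i, j) is the sum mod q of the
    labels on all edges joining check node i and variable node j."""
    H = [[0] * num_vars for _ in range(num_checks)]
    for check, var, label in labeled_edges:
        assert 1 <= label < q, "labels are assumed to be nonzero"
        H[check][var] = (H[check][var] + label) % q
    return H

def is_codeword(H, x, q):
    """True if sum_j H[i][j] * x[j] = 0 (mod q) at every check node i."""
    return all(sum(h * s for h, s in zip(row, x)) % q == 0 for row in H)

# Hypothetical (2, 4)-regular toy graph over GF(5):
# 4 variable nodes, 2 check nodes, edges given as (check, variable, label).
q = 5
edges = [(0, 0, 1), (1, 0, 2), (0, 1, 3), (1, 1, 1),
         (0, 2, 4), (1, 2, 4), (0, 3, 2), (1, 3, 3)]
H = parity_check_from_labels(2, 4, edges, q)
```

Labels on parallel edges are added in the field, so two parallel edges may cancel: labels 2 and 3 between the same pair of nodes yield a zero matrix entry over GF(5).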

    2) Galois fields GF($q$) exist for values of $q$ satisfying $q = p^m$ where $p$ is prime and $m$ is an arbitrary positive integer. See [3].


    Fig. 5. Upper bounds on the minimum required SNR for successful ML decoding of some GQC-LDPC codes over an AWGN channel. The solid line is the Shannon limit.

    GF($q$) coset-ensembles and quantization mappings are defined in the same way as with modulo-$q$ codes. Thus, we obtain GF($q$) quantized coset (GQC) code-ensembles.

    As with MQC codes, analysis of ML decoding properties of GQC-LDPC codes involves the types rather than the weights of codewords. Nevertheless, the analysis is simplified using the following lemma.

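    Lemma 2: Let $\mathbf{w}_1$ and $\mathbf{w}_2$ be two equal-weight words.

    The probabilities of the words belonging to a randomly selected

    code $C$ from a GF$(q)$, $(c, d)$-regular ensemble satisfy

    $$\Pr[\mathbf{w}_1 \in C] = \Pr[\mathbf{w}_2 \in C].$$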

    The proof of this lemma relies on the following two observa-

    tions. First, by the symmetry of the construction of the LDPC

    ensemble, any reordering of a word's symbols has no effect on the probability of the word belonging to a randomly selected

    code. Second, if we replace a nonzero symbol $x$ by another nonzero symbol $x'$,

    we can match any code $C$ with a unique code $C'$ such

    that the modified word belongs to $C'$ if and only if the original

    word belongs to $C$. The code $C'$ is constructed by modifying

    the labels on the edges adjacent to the corresponding variable

    node using $g' = g \cdot x \cdot (x')^{-1}$ (evaluated over GF$(q)$).
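    The label-rescaling argument can be checked numerically. The sketch below assumes the relabeling rule g' = g · x · (x')^{-1}, which is the unique rule that keeps every product g · x unchanged when the symbol x is replaced by x'; the exact formula printed in the paper is not legible in this copy. The field GF(8) with polynomial x^3 + x + 1 and the helper names `gf_mul`, `gf_inv`, and `relabel` are illustrative assumptions.

```python
def gf_mul(a, b, m=3, poly=0b1011):
    """Multiply two GF(2^m) elements stored as m-bit integers (default: GF(8))."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> m:
            a ^= poly
    return result

def gf_inv(a, m=3, poly=0b1011):
    """Multiplicative inverse by exhaustive search (fine for small fields)."""
    for cand in range(1, 1 << m):
        if gf_mul(a, cand, m, poly) == 1:
            return cand
    raise ZeroDivisionError("zero has no inverse")

def relabel(labels, old_symbol, new_symbol):
    """Rescale edge labels so that label * new_symbol equals old label * old_symbol."""
    scale = gf_mul(old_symbol, gf_inv(new_symbol))
    return [gf_mul(g, scale) for g in labels]
```

    Because multiplication by a fixed nonzero field element is a bijection on the nonzero labels, the map from the original code to the relabeled one is one-to-one, which is what the counting argument needs.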

    We have now obtained that the probability of a word be-

    longing to a randomly selected code is dependent on its weight

    alone, and not on its type. We use this fact in the following the-

    orem, to produce a convenient expression for the normalized

    spectrum of LDPC codes over GF$(q)$.

    Theorem 8: The asymptotic normalized ensemble spectrum

    of a $(c, d)$-regular LDPC code over GF$(q)$ is given by

    (25)

    where and is given by

    (26)

    Note that in (25) the parameter is implied by the relation

    and, therefore, the right-hand side of the equation

    is, in fact, a function of alone. The proof of the theorem is

    provided in Appendix III-A.

    Theorems 3, 5, 6, and Lemma 1 carry over from MQC-LDPC

    to GQC-LDPC codes, with minor modifications. In Theorem

    3, modulo-$q$ addition is replaced by addition over GF$(q)$. In

    Theorem 5, the requirement is added and the definition

    of is replaced by

    (27)

    The proof is similar to the case of MQC-LDPC codes, setting

    in (25) to upper-bound . In

    Lemma 1, (20) is replaced by

    where and is given by

    Finally, Theorem 6 carries over unchanged from MQC-LDPC

    codes.

    The definition of expurgated GQC-LDPC ensembles is identical to the equivalent MQC-LDPC definition. We now state the

    main theorem of this section (which is similar to Theorem 7).

    Theorem 9: Let be the transition probability assign-

    ment of a discrete memoryless channel with an input alphabet

    . Suppose . Let be some rational positive

    number, and let , , and be defined as in Theorem

    3. Let be an arbitrary number. Then for and large

    enough there exists a -regular expurgated GQC-LDPC en-

    semble (containing all but a diminishingly small proportion

    of the -regular GQC-LDPC codes, as shown in Theorem

    6) of length and design rate , satisfying , such

    that the ML decoding error probability over satisfies

    (28)

    The proof of this theorem follows the lines of the proof of Theorem 7,

    replacing (78) with

    (29)

    The difference between (28) and (24) results from the larger

    span of the set defined in (27) in comparison with the one

    defined in Theorem 5. Thus, we are able to define such that

    the first term in (12) disappears.

    Applying the same arguments as with BQC-LDPC codes and

    MQC-LDPC codes, we obtain that GQC-LDPC codes can be designed for reliable transmission (under ML decoding) at any

    rate below capacity. Furthermore, the above theorem guarantees

    that at any rate, the decoding error exponent for GQC-LDPC

    codes asymptotically approaches the random-coding exponent,

    thus outperforming our bounds for both the BQC-LDPC and

    MQC-LDPC ensembles.

    Fig. 5 compares several GQC-LDPC code ensembles over the

    AWGN channel. The quantizations used are the same as those

    used for Fig. 2 and are given by (10) (using the representation

    of elements of GF by -dimensional binary vectors [3]).

    To obtain these bounds, we again use methods similar to the

    ones discussed above for BQC-LDPC codes (here we seek the

    maximum such that for all having

    and then define as in (29)).

    V. ITERATIVE DECODING

    Our analysis so far has focused on the desirable properties of

    the proposed codes under optimal ML decoding. In this section,

    we demonstrate that the various codes show favorable perfor-

    mance also under practical iterative decoding.

    The BQC-LDPC belief propagation decoder is based on the

    well-known LDPC decoder, with differences involving the ad-

    dition of symbol nodes alongside variable and check nodes, de-

    rived from a factor graph representation by McEliece [20].

    Fig. 6. Diagram of the bipartite graph of a $(2, 3)$-regular BQC-LDPC code with $T = 3$.

    The decoding process attempts to recover , the codeword of

    the underlying LDPC code. Fig. 6 presents an example of the bi-

    partite graph for a $(2, 3)$-regular BQC-LDPC code with $T = 3$.

    Variable nodes and check nodes are defined in a manner similar

    to that of standard LDPC codes. Symbol nodes correspond to

    the symbols of the quantized codeword (1). Each symbol node

    is connected to the variable nodes that make up the subvector

    that is mapped to the symbol. Decoding consists of alternating

    rightbound and leftbound iterations. In a rightbound iteration,

    messages are sent from variable nodes to symbol nodes and

    check nodes. In a leftbound iteration, the opposite occurs. Un-

    like the standard LDPC belief-propagation decoder, the channel

    output resides at the symbol nodes, rather than at the variable

    nodes. The use of the coset vector is easily accounted for at the

    symbol nodes.
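    The role of a symbol node in a leftbound iteration can be sketched as follows. This is an illustrative reading of the description above, not the authors' decoder: it assumes that the transmitted channel letter is the quantization of the codeword bits XORed with the coset bits, that incoming rightbound messages are probabilities of each bit being 1, and that the outgoing message for position k marginalizes over the other positions only. The function name and argument layout are hypothetical.

```python
from itertools import product

def symbol_node_messages(likelihood, delta, coset_bits, prior_one):
    """
    likelihood[a] : P(observed output | channel letter a)
    delta[bits]   : quantization mapping from a T-bit tuple to a channel letter
    coset_bits    : the T coset-vector bits at this symbol position
    prior_one[k]  : incoming belief that codeword bit k equals 1
    Returns, for each position k, the normalized pair (message for bit 0, message for bit 1).
    """
    T = len(coset_bits)
    out = []
    for k in range(T):
        acc = [0.0, 0.0]
        for bits in product((0, 1), repeat=T):
            weight = 1.0
            for j, b in enumerate(bits):
                if j != k:  # extrinsic: the belief about bit k itself is excluded
                    weight *= prior_one[j] if b else 1.0 - prior_one[j]
            # The coset vector is accounted for here, before quantization.
            sent = tuple(b ^ v for b, v in zip(bits, coset_bits))
            acc[bits[k]] += weight * likelihood[delta[sent]]
        total = acc[0] + acc[1]
        out.append((acc[0] / total, acc[1] / total))
    return out
```

    The cost grows as 2^T per symbol node, which is acceptable for the short subvectors considered in the paper.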

    The MQC-LDPC and GQC-LDPC belief propagation de-

    coders are modified versions of the belief propagation decoder introduced by Gallager in [12] for arbitrary alphabet LDPC

    codes, employing $q$-dimensional vector messages. The mod-

    ifications, easily implemented at the variable nodes, account

    for the addition of the coset vector at the transmitter and

    for the use of quantization. Efficient implementation of belief

    propagation decoding of arbitrary-alphabet LDPC codes is

    discussed by Davey and MacKay [8] and by Richardson and

    Urbanke [23]. The ideas in [8] are suggested in the context of

    LDPC codes over GF$(q)$ but also apply to codes employing

    modulo-$q$ arithmetic. The method discussed in [23] suggests

    using the discrete Fourier transform (DFT) (the $m$-dimensional

    DFT for codes over GF$(2^m)$) to reduce complexity. This is

    particularly useful for codes defined over GF$(2^m)$, since then the multiplications are eliminated.
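    For fields of characteristic 2, the m-dimensional DFT with two points per dimension coincides with the Walsh-Hadamard transform, which needs only additions and subtractions. The sketch below combines check-node input messages by this route and compares the result with direct convolution. It is a simplified illustration: edge labels are ignored (in a labeled graph each message would first be permuted according to its label), and the function names are assumptions.

```python
def wht(vec):
    """Unnormalized Walsh-Hadamard transform of a list whose length is a power of 2."""
    out = list(vec)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b  # butterfly: no multiplications
        h *= 2
    return out

def check_node_fast(messages):
    """Distribution of the XOR (field sum) of independent symbols with the given pmfs."""
    q = len(messages[0])
    spectrum = [1.0] * q
    for msg in messages:
        spectrum = [s * t for s, t in zip(spectrum, wht(msg))]
    return [v / q for v in wht(spectrum)]  # the transform is its own inverse up to 1/q

def check_node_bruteforce(messages):
    """Same distribution by repeated direct convolution, for comparison."""
    q = len(messages[0])
    dist = [1.0] + [0.0] * (q - 1)
    for msg in messages:
        new = [0.0] * q
        for a, pa in enumerate(dist):
            for b, pb in enumerate(msg):
                new[a ^ b] += pa * pb
        dist = new
    return dist
```

    Direct convolution of d messages costs on the order of d·q^2 operations, while the transform route costs on the order of d·q·log q.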

    An important property of quantized coset LDPC codes (BQC,

    MQC, and GQC) is that for any given bipartite graph, the prob-

    ability of decoding error, averaged over all possible values of

    the coset vector , is independent of the transmitted codeword

    of the underlying LDPC code (as observed by Kavčić et al. [15]

    in the context of coset LDPC codes for ISI channels). This fa-

    cilitates the extension of the density evolution analysis method

    (see [23]) to quantized coset codes.

    BQC-LDPC density evolution evaluates the density of mes-

    sages originating from symbol nodes using Monte Carlo simu-

    lations. Density evolution for MQC- and GQC-LDPC codes is

    complicated by the exponential amount of memory required

    to store the probability densities of $q$-dimensional messages. A

    possible solution to this problem is to rely on Monte Carlo sim-

    ulations to evolve the densities.

    A. Simulation Results for BQC-LDPC Codes

    In this paper, we have generally confined our discussion to

    LDPC codes based on regular graphs. Nevertheless, as with

    standard LDPC codes, the best BQC-LDPC codes under be-

    lief-propagation decoding are based on irregular graphs as in-

    troduced by Luby et al. [18]. In this section, we therefore focus

    on irregular codes.

    A major element in generating good BQC-LDPC codes is the

    design of good quantizations. In our analysis of ML decoding,

    we focused on the probability assignment associated with

    the quantizations. A good probability assignment for memory-

    less AWGN channels is generally designed to approximate a

    Gaussian distribution (see [30] and [31]). Iterative decoding,

    however, is sensitive to the particular mapping .

    A key observation in the design of BQC-LDPC quantizations

    is that the degree of error protection provided to different bits

    that are mapped to the same channel symbol is not identical.

    Useful figures of merit are the following values (adapted from

    [24, Sec. III.E], replacing the log-likelihood messages of [24]

    with plain likelihood messages):

    (30)

    where is a random variable corresponding to the leftbound

    message from the th position of a symbol node, assuming the

    transmitted symbol was zero, in the first iteration of belief prop-

    agation. It is easy to verify that the following relation holds:

    Quantizations rendering a low for a particular produce a

    stronger leftbound message at the corresponding position, at

    the first iteration. Given a particular distribution , different

    quantizations associated with produce different sets

    . In keeping with the experience available for standard LDPC codes, simulation results indicate that good quantizations

    favor an irregular set of , meaning that some values of

    the set are allowed to be larger than others. In the design of edge

    distributions, to further increase irregularity, it is useful to con-

    sider not only the fraction of edges of given left degree but

    rather , accounting for the left variable node's symbol-node

    position.

    In this paper, we have employed only rudimentary methods

    of designing BQC-LDPC edge distributions. Table I presents the

    edge distributions of a rate-2 (measured in bits per channel use)

    BQC-LDPC code for the AWGN channel. The following length

    quantization was used with 8-PAM signaling .

    TABLE I. LEFT EDGE DISTRIBUTION FOR A RATE-2 BQC-LDPC CODE WITHIN 1.1 dB OF THE SHANNON LIMIT. THE RIGHT EDGE DISTRIBUTION IS GIVEN BY COEFFICIENTS 0.25 AND 0.75

    TABLE II. LEFT EDGE DISTRIBUTION FOR A RATE-2 MQC-LDPC CODE WITHIN 0.9 dB OF THE SHANNON LIMIT. THE RIGHT EDGE DISTRIBUTION IS GIVEN BY COEFFICIENTS 0.5 AND 0.5

    Applying the notation of (10), we obtain that a simple assignment of values in an ascending order renders a good quantization

    The SNR threshold (determined by simulations) is 1.1 dB away

    from the Shannon limit (which is 11.76 dB at rate 2) at a block

    length of 200 000 bits or 50 000 symbols and a bit-

    error rate of about . Decoding typically requires 100–150

    iterations to converge. Note that the ML decoding threshold of

    a random code, constructed using the distribution asso-

    ciated with , is 11.99 dB (evaluated using the mutual in-

    formation as described in [13, Sec. 5.6]). Thus, the gap to the

    random-coding limit is 0.87 dB. The edge distributions were de-

    signed based on a rate-$1/2$ binary LDPC code suggested by [24],

    using trial-and-error to design while keeping the marginal

    distribution fixed.
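    The dB figures quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes a real-valued AWGN channel with capacity 0.5·log2(1 + SNR) bits per channel use, which is the model consistent with the quoted 11.76 dB limit at rate 2; the variable names are illustrative.

```python
import math

def shannon_limit_db(rate_bits_per_use):
    """SNR in dB at which the real AWGN capacity 0.5*log2(1 + SNR) equals the given rate."""
    return 10 * math.log10(2 ** (2 * rate_bits_per_use) - 1)

limit_db = shannon_limit_db(2)               # Shannon limit at 2 bits per channel use
threshold_db = limit_db + 1.1                # simulated BQC-LDPC threshold, 1.1 dB above the limit
random_coding_gap_db = threshold_db - 11.99  # distance to the random-coding threshold quoted in the text
```

    At rate 2 the required SNR is 15 in linear terms, and the remaining gap to the random-coding threshold follows by subtraction.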

    B. Simulation Results for MQC-LDPC Codes

    The MQC equivalent of (30) are the values given

    by (11). As with BQC-LDPC codes, MQC-LDPC codes favor

    quantizations rendering an irregular set (as indicated

    by simulation results).

    In [24] and [6], methods for the design of edge distributions

    are presented that use singleton error probabilities produced by density evolution and iterative application of linear program-

    ming. In this work, we have used a similar method, replacing

    the probabilities produced by density evolution with results of

    Monte Carlo simulations. An additional improvement was ob-

    tained by replacing singleton error probabilities by the func-

    tional ( denoting $q$-ary entropy).
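    The functional itself is not legible in this copy, so the following sketch shows only its building block: the q-ary entropy of a single q-dimensional message. It assumes that q-ary entropy means Shannon entropy with logarithms taken to base q, so that a uniform message scores 1 and a fully decided message scores 0. The function name is hypothetical.

```python
import math

def q_ary_entropy(pmf):
    """Entropy of a probability vector, measured in q-ary units (q = len(pmf))."""
    q = len(pmf)
    return -sum(p * math.log(p, q) for p in pmf if p > 0)
```

    Unlike a singleton error probability, this measure also reflects how the remaining probability mass is spread over the wrong symbols.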

    Table II presents the edge distributions of a rate-2 (mea-

    sured again in bits per channel use) MQC-LDPC code for the

    AWGN channel. The code alphabet size is , and -PAM

    signaling was used , with the quantization listed

    below. The values of are listed in ascending order, i.e.,

    . The quantization values were selected

    TABLE III. LEFT EDGE DISTRIBUTION FOR A RATE-2 GQC-LDPC CODE WITHIN 0.65 dB OF THE SHANNON LIMIT. THE RIGHT EDGE DISTRIBUTION IS GIVEN BY COEFFICIENTS 0.5 AND 0.5

    to approximate a Gaussian distribution, and their order was

    determined by trial-and-error

    At an SNR 0.9 dB away from the Shannon limit, the symbol

    error rate was (50 Monte Carlo simulations). The gap

    to the ML decoding threshold of a random code, constructed

    using the distribution associated with , is 0.7 dB.

    The block length used was 50 000 symbols or

    204 373 bits. Decoding typically required 100–150 iterations to

    converge.

    C. Simulation Results for GQC-LDPC Codes

    In contrast to BQC-LDPC and MQC-LDPC codes, GQC-

    LDPC codes appear resilient to the ordering of values in the quantiza-

    tions used. This is the result of the use of random labels, which

    induce a permutation on the rightbound and leftbound messages

    at check nodes.

    Table III presents the edge distributions of a rate-2 (mea-

    sured again in bits per channel use) GQC-LDPC code for the

    AWGN channel. The methods used to design the codes are the

    same as those described above for MQC-LDPC codes. The

    code alphabet size is , and -PAM signaling was used

    , with the quantization listed below. The values of

    are listed in ascending order (using the representation

    of elements of GF by -dimensional binary vectors

    [3]). The quantization values were selected to approximate a

    Gaussian distribution

    At an SNR 0.65 dB away from the Shannon limit, the symbol

    error rate was less than (100 Monte Carlo simulations). The gap to the ML decoding threshold of a random code, con-

    structed using the distribution associated with , is

    0.4 dB. The block length used was 50 000 symbols or

    200 000 bits. Decoding typically required 100–150 itera-

    tions to converge.
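    The block lengths quoted in Sections V-A to V-C can be related to the alphabet sizes by simple arithmetic. The alphabet sizes themselves are not legible in this copy; the values 17 (MQC) and 16 (GQC) used below are inferences from the quoted bit counts, not figures taken from the text, and the helper name is hypothetical.

```python
import math

def block_length_bits(n_symbols, alphabet_size):
    """Length in bits of a block of n_symbols symbols over an alphabet of the given size."""
    return n_symbols * math.log2(alphabet_size)

mqc_bits = block_length_bits(50_000, 17)  # inferred alphabet size (assumption)
gqc_bits = block_length_bits(50_000, 16)  # inferred alphabet size (assumption)
bqc_bits_per_symbol = 200_000 / 50_000    # binary digits mapped to each channel symbol
```

    Under these assumptions the MQC figure rounds to the quoted 204 373 bits and the GQC figure is exactly 200 000 bits.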

    D. Comparison With Existing Methods

    Wachsmann et al. [31] present reliable transmission using

    multilevel codes with turbo-code constituent codes, within approx-

    imately 1 dB of the Shannon limit at rate 1 bit per dimension.

    Multilevel coding is similar to BQC quantization mapping, in

    that multiple bits (each from a separate binary code) are com-

    bined to produce channel symbols. However, the division of

    the code into independent subcodes and the hard decisions per-

    formed during multistage decoding, are potentially suboptimal.

    Robertson and Wörz [25], using turbo codes with TCM con-

    stituent codes, report several results that include reliable trans-

    mission within 0.8 dB of the Shannon limit at rate 1 bit per di-

    mension.

    Kavčić, Ma, Mitzenmacher, and Varnica [16], [19], [29]

    present reliable transmission within 0.74 dB of the Shannon limit at

    rate 2.5 bits per dimension. Their method has several similar-

    ities to BQC-LDPC codes. As in multilevel schemes, multiple

    bits from the outer codes are combined and mapped into channel

    symbols in a manner similar to BQC quantizations. Quantiza-

    tion mapping can be viewed as a special case of a one-state

    MIR trellis code, having parallel branches connecting every

    two subsequent states of the trellis. Furthermore, the irregular

    LDPC construction method of [19] and [29] is similar to the

    method of Section V-A. In [19] and [29], the edge distribu-

    tions for the outer LDPC codes are optimized separately, and

    the resulting codes are interleaved into one LDPC code. Simi-

    larly, BQC-LDPC construction distinguishes between variable

    nodes of different symbol node position. However, during the

    construction of the BQC-LDPC bipartite graph, no distinction is

    made between variable nodes in the sense that all variable nodes

    are allowed to connect to the same check nodes (see Section II).

    This is in contrast to the interleaved LDPC codes of [19] and

    [29], where variable nodes originating from different subcodes

    are connected to distinct sets of check nodes.

    VI. CONCLUSION

    The codes presented in this paper provide a simple approach

    to the problem of applying LDPC codes to arbitrary discrete

    memoryless channels. Quantization mapping (based on ideas by

    Gallager [13] and McEliece [20]) has enabled the adaptation of

    LDPC codes to channels where the capacity-achieving source

    distribution is nonuniform. It is thus a valuable method of over-

    coming the shaping gap to capacity. The addition of a random

    coset vector (based on ideas by Gallager [13] and Kavčić et al.

    [15]) is a crucial ingredient for rigorous analysis.

    Our focus in this paper has been on ML decoding. We have

    shown that all three code configurations presented, under ML

    decoding, are capable of approaching capacity arbitrarily closely.

    Our analysis of MQC and GQC codes has relied on generaliza-

    tions of concepts useful with binary codes, like code spectrum

    and expurgated ensembles.

    We have also demonstrated the performance of practical itera-

    tive decoding. The simple structure of the codes presented lends

    itself to the design of conceptually simple decoders, which are

    based entirely on the well-established concepts of iterative be-

    lief-propagation (e.g., [20], [22], [12]). We have presented sim-

    ulation results of promising performance within 0.65 dB of the

    Shannon limit at a rate of 2 bits per channel use. We are cur-

    rently working on an analysis of iterative decoding of quantized

    coset LDPC codes [1].

    APPENDIX I

    PROOFS FOR SECTION II

    A. Proof of for for Nondegenerate

    Quantizations and Channels

    We define a quantization to be nondegenerate if there ex-

    ists no integer such that is an integer mul-

    tiple of for all (such a quantization could be replaced by a simpler quantization over an alphabet of size that would

    equally attain input distribution ). A channel is nondegen-

    erate if there exist no values such that

    for all . By inspecting when the

    Cauchy–Schwarz inequality becomes an equality, we see that

    if and only if

    for all and . Hence, by the channel nondegeneracy assump-

    tion, , for all . This contradicts the quantiza-

    tion nondegeneracy assumption, since the number of elements

    in that are mapped to each channel symbol is some in-

    teger multiple of the additive order of (which is two).

    B. Sketch of Proof for Theorem 1

    The proofs of Theorems 1 and 3 are similar. In this paper, we

    have selected to elaborate on Theorem 3, as the gap between its

    proof and proofs provided in [26] and [21] is greater. Thus, we

    concentrate here only on the differences between the two.

    As in Appendix II-A, we define and (and equiv-

    alently and ) as

    was transmitted

    where is a code of ensemble , defined as in Definition 1. We

    obtain (using the union bound)

    (31)

    We proceed to bound both elements of the sum.

    1) As in Appendix II-A we obtain

    Recalling (1) and defining

    we obtain the following identity:

    Each element of the vector is now dependent only on

    the transmitted symbol at position . We therefore

    obtain

    Using the above result, we now examine , the

    probability of error for a fixed parent code (obtained

    by Step 1 of Definition 1), averaged over all possible

    values of (see the first equation at the bottom of the

    page). and are defined in a manner similar to the

    definition of . Letting and

    we obtain (32) at the bottom of the page, where is defined as in (2). Defining we

    observe that at least of the components

    are not equal to , and hence, recalling (3)

    where is the spectrum of code . Clearly, this bound

    can equally be applied to , abandoning the assump-

    tion of a fixed . The average spectrum of ensemble

    is equal to that of our original ensemble , and therefore

    we obtain

    (33)

    2) The methods used to bound the second element of (31)

    are similar to those used in Appendix II-A. The bound

    obtained is

    (34)

    Combining (31), (33), and (34) we obtain our desired re-

    sult of (4).

    (32)

    C. Proof of Theorem 2

    Let be defined as in (7). Let be some arbitrary number

    smaller than that will be determined later. Let be the un-

    derlying expurgated LDPC ensemble of , obtained by expur-

    gating codes with minimum distance less than or equal to .

    We use Theorem 1 to bound .

    Defining

    we have (using (4))

    (35)

    We now examine both elements of the above sum.

    1) Given that all codes in the expurgated ensemble have min-

    imum distance greater than , we obtain that

    for all . We therefore turn to examine .

    A desirable effect of expurgation is that it also reduces

    the number of words having large . Assume and are

    two codewords of a code , such that their weight

    satisfies . The weight of the codeword

    satisfies , contradicting the con-

    struction of the expurgated ensemble. Hence, cannot

    contain more than one codeword having . Letting

    denote the spectrum of code , we have

    Clearly, this bound carries over to the average spectrum.

    Selecting so that we obtain

    (36)

    2) Examining the proof by [21, Appendix E], we obtain

    Therefore, there exist positive integers and such

    that for , , satisfying , , and the following inequality holds:

    (37)

    Combining (35)–(37) we obtain our desired result of (9).

    APPENDIX II

    PROOFS FOR SECTION III

    A. Proof of Theorem 3

    The proof of this theorem is a generalization of the proofs

    available in [21] and [26], from binary-input symmetric output

    to arbitrary channels.

    All codes in have the form for some code of . We can therefore write

    where

    was transmitted

    We define and (and, equivalently, and )

    as

    type was transmitted

    (38)

    and obtain (using the union bound)

    (39)

    We now proceed to bound both elements of the sum.

    1) We first bound (defined above), the error proba-

    bility for a fixed index and code . This is the proba-

    bility that some codeword such that type

    is more likely to have produced the observation than the true

    type was transmitted

    where is defined as shown in the equation at the

    bottom of the page. Therefore,

    otherwise.

    Each code has the form for some

    parent code and a vector . We now use the above result to bound , the probability of error for a fixed

    parent code , averaged over all possible values of

    The elements of the random vector are indepen-

    dent and identically distributed (i.i.d.) random variables. Therefore,

    Letting and we obtain

    where is defined as in (11). Letting

    type , we have

    Clearly, this bound can equally be applied to , aban-

    doning the assumption of a fixed . Finally, we bound

    (40)

    2) We now bound . For this we define a set of auxiliary

    ensembles.

    a) Ensemble is generated from by adding, for each

    code in , the codes generated by all possible

    permutations on the order of codewords within the

    code.

    b) Ensemble is generated from by adding, for

    each code in , the codes generated by all pos-

    sible permutations on the order of symbols within

    the code.

    c) Ensemble is generated from by adding,

    for each code in , codes of the form

    for all possible vectors

    .

    Examining the above construction, it is easy to verify that

    any reordering of the steps has no impact on the final result. Therefore, ensemble can be obtained from by

    employing Steps 2a) and 2b).

    We also define ensemble as the quantization of .

    can equivalently be obtained employing the above Steps

    2a) and 2b) on ensemble . Therefore, , defined

    based on (38), is identical to a similarly defined ,

    evaluated over ensemble .

    We proceed to examine

    (41)

    where is defined as the probability, for a fixed

    and index , assuming , that a codeword of

    the form , where should

    be more likely than the transmitted . The random

    space here consists of a random selection of the code

    from . Therefore,

    type

    type

    Letting be an arbitrary number, we have by

    the union bound (extended as in [13])

    Employing Lemma 3, which is provided in Ap-

    pendix II-B, we obtain

    Letting , we have

    where is some arbitrary number. Letting

    we have

    (42)

    Using (41) and (42) we obtain

    Employing Lemma 3 once more, we obtain

    The remainder of the proof follows in direct lines as in

    [13, Theorem 5.6.1]. Abandoning the assumption of a

    fixed , and recalling that , we obtain

    (43)

    Combining (39), (40), and (43) we obtain our desired result of

    (12).

    B. Statement and Proof of Lemma 3

    The following lemma is used in the proof of Theorem 3 and

    relies on the notation introduced there.

    Lemma 3: Let and let

    be two distinct indexes. Let and be the respective code-

    words of a randomly selected code of . Then

    1) .

    2) If type then

    (44)

    where is defined by (13).

    Proof: Let be a randomly selected code from . We first

    fix the ancestor code and investigate the relevant probabili-

    ties. We define the following random variables: denotes the

    parent code from which was generated. Likewise, de-

    notes the parent of , and denotes the parent of .

    Let be an arbitrary nonzero word. We now examine the

    codewords of

    otherwise

    where is the number of codewords in . Recalling that

    is of rate at least , we have that and, hence,

    .

    We now examine

    where is the type of . The last equation holds because, given

    a uniform random selection of the symbol permutation , the probability of being a codeword of is equal to the

    fraction of -type codewords within the entire set of -type

    words.

    Examining we have

    where is now the type of .

    We finally abandon the assumption of a fixed and obtain

    If is in , we obtain from (13) the desired result of (44), and

    thus complete the proof of the lemma.
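    The counting step used in the proof above, namely that under a uniformly random symbol permutation the probability of a fixed word being a codeword equals the fraction of codewords within its type class, can be verified by brute force on a toy example. In the sketch below the type is taken as the unnormalized vector of symbol counts, and the permutation is applied to the word rather than to the code, which is equivalent; both choices, and all names, are assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def word_type(word, q):
    """Type of a word: how many times each symbol 0..q-1 occurs (unnormalized)."""
    counts = Counter(word)
    return tuple(counts.get(a, 0) for a in range(q))

def type_class_size(t):
    """Number of distinct words of a given type (a multinomial coefficient)."""
    size = factorial(sum(t))
    for c in t:
        size //= factorial(c)
    return size

def membership_by_type(word, code, q):
    """Fraction of codewords of the word's type among all words of that type."""
    t = word_type(word, q)
    hits = sum(1 for c in code if word_type(c, q) == t)
    return hits / type_class_size(t)

def membership_by_enumeration(word, code):
    """Same probability, computed over every permutation of the word's positions."""
    code_set = set(code)
    perms = list(permutations(range(len(word))))
    hits = sum(1 for p in perms if tuple(word[i] for i in p) in code_set)
    return hits / len(perms)
```

    The two functions agree because a uniformly permuted word is uniformly distributed over its type class.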

    C. Proof of Theorem 4

    To prove this theorem, we first quote the following notation

    and theorem of Burshtein and Miller [4] (see also [14]). Given

    a multinomial , the coefficient of is de-

    noted3 . The theorem is stated for the

    two-dimensional case but is equally valid for higher dimen-

    sions (we present a truncated version of the theorem actually

    discussed in [4]).

    Theorem 10: Let be some rational number and let

    be a function such that is a multinomial with

    nonnegative coefficients. Let and be some rational

    numbers and let be the series of all indexes such that

    is an integer and . Then

    (45)

    and

    (46)

    We now use this theorem to prove Theorem 4. The proof fol-

    lows the lines of a similar proof for binary codes in [4].

    We first calculate the asymptotic spectrum for satisfying

    for all .

    Given a type and

    let be an indicator random variable equal to if the th

    word of type (by some arbitrary ordering of type- words) is a

    codeword of a drawn code, and otherwise. Then

    Therefore,

    (47)

    the final equality resulting from the symmetry of the ensemble

    construction. Thus, for all

    (48)

    3 The same notation $\lfloor x \rfloor$ is used to denote the largest integer smaller than or equal to $x$. The distinction between the two meanings is to be made based on the context of the discussion.

    We now examine . Let be a word of type .

    Each assignment of edges by a random permutation induces a

    symbol coloring on each of the check node sockets, using

    colors of type . The total number

    of colorings is

    (49)

    Given an assignment of graph edges, is a codeword if at each

    check node, the modulo-$q$ sum of all adjacent variable nodes is

    zero. The number of valid assignments (assignments rendering

    a codeword), is determined using an enumerating function

    (50)

    where is given by the first equation at the bottom of the

    page. We now have

    (51)

    Using (49) we have

    (52)

    Using (50) and Theorem 10 we have

    (53)
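    The role of the enumerating function can be illustrated numerically for a single check node. The sketch below weights every assignment of d symbols whose modulo-q sum is zero by the product of one indeterminate per symbol value, first by exhaustive enumeration and then through a DFT-style character sum. The closed form used here is the standard identity for counting zero-sum tuples; it is offered as an assumption about the structure of the lost equations, not as a transcription of them, and the function names are hypothetical.

```python
import cmath
from itertools import product

def enumerator_bruteforce(x, d):
    """Sum of x[a_1] * ... * x[a_d] over all d-tuples of symbols whose modulo-q sum is zero."""
    q = len(x)
    total = 0.0
    for combo in product(range(q), repeat=d):
        if sum(combo) % q == 0:
            term = 1.0
            for a in combo:
                term *= x[a]
            total += term
    return total

def enumerator_dft(x, d):
    """Same quantity as (1/q) * sum_k (sum_i x[i] * w**(i*k))**d with w = exp(2*pi*j/q)."""
    q = len(x)
    total = 0.0
    for k in range(q):
        inner = sum(x[i] * cmath.exp(2j * cmath.pi * i * k / q) for i in range(q))
        total += (inner ** d).real  # imaginary parts cancel across k
    return total / q
```

    Setting every indeterminate to 1 counts the valid assignments directly, giving q^(d-1) of the q^d possible ones.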

    Combining (48), (51), (52), and (53) we obtain

    To adapt the proof to having for some , we modify the

    expression for as follows. Assuming (without loss of

    generality) that for all , and for all ,

    we obtain

    where is given by the second equation at the

    bottom of the page. Following the line of previous development,

    we obtain

    We now obtain (16). The development follows the lines of a

    similar development in [12, Theorem 5.1]. Given , we define

    for

    and

    (54)

    Therefore, we obtain the expression for and subsequently

    (55) (both at the bottom of the page). Combining (54) with (55)

    we obtain (16).

    D. Proof of Theorem 5

    From (15), we obtain, selecting

    (56)

    We now bound for ( being defined in (17)).

    We introduce the following standard notation, borrowed from

    literature covering the DFT:

    (57)

    (58)

    Incorporating this notation into (16), we now have

    (59)

    Let .

    Using we have that there exists at least one

    satisfying .

    Using , we obtain that there exists at least one

    satisfying .

    We now bound for all

    (60)

    For

    (61)

    We separately treat both elements of the above sum

    (62)

    Using , , and and using the fact that $q$ is prime, we obtain that and have no common divisors. Therefore, . Defining as

    (63)

    we have

    therefore,

    Taking the square root of both sides of the above equation, we

    obtain

    (64)

    where is some positive constant smaller than , dependent on

    and but independent of and .

    and

    (55)


    Combining (61), (62), and (64) we obtain

    (65)

    where , like , is some positive constant smaller than , dependent on and but independent of and .

    Recalling (59)

    using (60) and (65) we obtain

    and, therefore, approaches with uniformly on . We

    now obtain

    where the above bound is obtained uniformly over . Finally, combining this result with (56) we obtain our desired result of (18).

    E. Proof of Lemma 1

    1) The proof of (19) is similar to the proof of Lemma 2 in [21]. We concentrate here only on the differences between the two, resulting from the enlargement of the alphabet size. The mapping between a bipartite graph and a matrix , defined in [21], is extended so that an element of the matrix is set to 1 if the corresponding left vertex (variable node) is nonzero, and to 0 otherwise. Therefore, although our alphabet size is in general larger than 2, we still restrict the ensemble of matrices to binary values.

    Unlike [21], applying the above mapping to a valid $q$-ary codeword does not necessarily produce a matrix whose columns are all of an even weight. Therefore, the requirement that be even does not hold. However, the matrix must still satisfy the condition that each of its populated columns has a weight of at least 2, since a single nonzero symbol cannot sum to zero at a check node. Hence, the number of populated columns cannot exceed . The remainder of the proof follows directly as in [21].

    2) We now turn to the proof of (20). Let be the type of

    . As we have seen in Appendix II-C, (49), and (50)

    (66)

    To bound the numerator, we use methods similar to those

    used to obtain (53), applying the bound (45) rather than

    the limit of (46). We obtain

    (67)

    To lower-bound the denominator, we use the well-known

    bound (e.g., [7, Theorem 12.1.3])

    (68)

    Combining (66), (67), and (68) we obtain

    (69)

    Using arguments similar to those used to obtain (56), we

    obtain

    (70)

    Recalling that is the type of word and given that

    is the number of nonzero elements in , we

    obtain . We now bound for values of

    satisfying , , and

    . To do this, we use methods similar to those used in Appendix II-D. Employing the notation of (57)

    and (58), we bound for

    We first examine the elements of the second sum on the

    right-hand side of the above equation. As in the proof

    of Theorem 5, we have for all . Defining as in (63), we obtain

    We now have


    The first sum in the last equation is . Dropping elements

    from the second sum can only increase the overall result,

    and hence we obtain

    (71)

    Combining (59), (60), and (71), we obtain

    (72)

    Combining (72) with (70), recalling (21), we obtain our

    desired result of (20).
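As an illustration of the argument in part 1) of the proof above (not part of the original paper), the following minimal sketch enumerates all words that satisfy a single modulo-$q$ check and records their Hamming weights. The function names and the check degree of 4 are assumptions chosen for the example. The sketch shows that weight 1 never occurs, so a populated column has weight at least 2, and that odd weights do occur for alphabets larger than binary, which is why the evenness requirement of [21] is dropped.

```python
from itertools import product


def zero_sum_words(q, d):
    """All length-d words over {0, ..., q-1} whose modulo-q sum is zero."""
    return [w for w in product(range(q), repeat=d) if sum(w) % q == 0]


def hamming_weight(word):
    """Number of nonzero symbols in the word."""
    return sum(1 for s in word if s != 0)


def check_weights(q, d):
    """Set of Hamming weights that occur among the zero-sum words."""
    return {hamming_weight(w) for w in zero_sum_words(q, d)}


# Hypothetical check degree d = 4, for a few small alphabet sizes.
for q in (2, 3, 4, 5):
    print(q, sorted(check_weights(q, 4)))
```

For the binary case only even weights appear, whereas for larger alphabets odd weights of 3 or more are also possible.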

    F. Proof of Theorem 6

    The proof of this theorem follows in lines similar to those

    used in the proofs of Theorems 2 and 3 of [21].

    Using a union bound, we obtain

    (73)

    where and are determined later. We now proceed to bound

    both elements of the above sum.

    1) Requiring and invoking (19) we obtain

    (74)

    As in the proof of Theorem 2 in [21], we define

    Using methods similar to those in [21] we obtain

    Recalling and we obtain

    We now define

    The inner contents of the braces are positive for all and

    approach as . Therefore, the above value is

    positive. Recalling and we have

    We now return to (74) and obtain

    (75)

    2) We now examine values of satisfying .

    From (20) we have

    (76)

    We now restrict , and obtain, recalling (21) and

    using

    Using for all , we have

    The function ascends from zero in the range . Defining

    we obtain that for all in the range

    We now define as some value in the range

    yielding . Using (76) we obtain

    Therefore,

    (77)

    Combining (73) with (75) and (77) we obtain our desired result

    of (22).


    G. Proof of Theorem 7

    Let be defined as in Theorem 6. Let be some number smaller than that will be determined later. Let be the underlying expurgated LDPC ensemble of , obtained by expurgating codes with minimum distance less than or equal to . We use Theorem 3 to bound .

    Assigning

    (78)

    we have (using (12))

    (79)

    We now examine both elements of the above sum.

    1) As in the proof of Theorem 2, we obtain that for all

    . We also obtain that for each , cannot contain more than one codeword having .

    Examining for we have

    Letting denote the spectrum of code , we

    have

    Summing over all and taking the average spectrum we obtain

    Finally, selecting so that we obtain

    (80)

    2) can be written as

    where is defined as in (17). We now examine

    (81)

    Examining (23) it is easy to verify that the first element

    of the above sum approaches zero as . To bound

    the second element, we first rely on (47), (68), and (69),

    and obtain a finite- bound on

    Thus, the second element in (81), evaluated over all valid

    , is bounded arbitrarily close to zero as . The

    third element is upper-bounded using Theorem 5, for ,

    satisfying and . The fourth element

    approaches zero as by (68) (recalling

    ).

    Summarizing, there exist positive integers , such

    that for , , satisfying , , and

    the following inequality holds:

    (82)

    Combining (79), (80), and (82) we obtain our desired result of

    (24).
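Equation (68) did not survive in this copy of the text. Assuming it is the standard type-class size bound of the cited theorem in [7], namely $\frac{1}{(n+1)^{q}} 2^{nH(P)} \le |T(P)| \le 2^{nH(P)}$ for a type $P$ over an alphabet of size $q$, the following sketch checks the bound numerically for a few small types. The function names are hypothetical and the example counts are arbitrary.

```python
from math import factorial, log2


def type_class_size(counts):
    """Number of words with the given symbol counts (a multinomial coefficient)."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)  # exact: every partial quotient is an integer
    return size


def entropy_bits(counts):
    """Entropy, in bits, of the empirical distribution defined by the counts."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def bounds_hold(counts):
    """Check 2^{nH} / (n+1)^q <= |T| <= 2^{nH}, with a small floating-point slack."""
    n, q = sum(counts), len(counts)
    size = type_class_size(counts)
    upper = 2 ** (n * entropy_bits(counts))
    lower = upper / (n + 1) ** q
    return lower <= size <= upper * (1 + 1e-9)


print(type_class_size((5, 3, 2)), bounds_hold((5, 3, 2)))
```

The lower bound is the direction used in the proofs above, where it lower-bounds the total number of colorings in the denominator.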

    H. Restriction to Prime Values of $q$

    Let be a positive integer, and let be a nonprime

    number. We now show that there exist values of such that no

    exists as in Theorem 5. Moreover, expurgation is impossible

    for those values of .

    To show this, we examine the subgroup of the modulo-$q$ group, formed by . This subgroup is isomorphic to the group modulo-2 (the binary field). Given a modulo-$q$ LDPC code , we denote by the subcode produced by codewords containing symbols from alone. We now examine the ensemble of subcodes .

    The binary asymptotic normalized spectrum is defined, for

    as

    The limiting properties of this spectrum are well known (see

    [17]). Fixing the ratio and letting ,

    approaches the value where

    is the binary entropy function.


    We now examine values of belonging to the set defined

    by

    Given the above discussion, the normalized spectrum of modulo-$q$ LDPC, defined by (14), evaluated over , corresponds to . We therefore obtain that for

    We now show that expurgation is impossible. The probability of the existence of words of normalized type in , in a randomly selected code, corresponds to the probability that any binary word satisfies the constraints imposed by the LDPC parity-check matrix. This results from the above-discussed isomorphism between and the modulo-2 group. This probability clearly does not approach zero (in fact, it is one), and, hence, expurgation of codes containing such words is impossible.
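The isomorphism argument above can be checked on a toy example. The symbols naming the subgroup are missing from this copy, so the sketch below assumes $q = 4$ and the two-element subgroup $\{0, 2\}$. The check structure `CHECKS` and all function names are hypothetical. It verifies that a word with symbols from $\{0, 2\}$ satisfies the modulo-4 checks exactly when the corresponding binary word satisfies the same checks modulo 2.

```python
from itertools import product

Q = 4  # assumed nonprime alphabet size
# Hypothetical toy graph: each check lists the indices of its adjacent variable nodes.
CHECKS = [(0, 1, 2), (1, 2, 3), (0, 3, 4, 5)]


def satisfies(word, checks, modulus):
    """True if the sum of the symbols at every check is zero under the given modulus."""
    return all(sum(word[i] for i in chk) % modulus == 0 for chk in checks)


def embed(bits):
    """Map a binary word into the subgroup {0, Q/2} of the modulo-Q group."""
    return tuple((Q // 2) * b for b in bits)


def subcode_matches_binary(checks, n):
    """Compare the modulo-Q subcode over {0, Q/2} with the binary code on the same graph."""
    return all(
        satisfies(embed(bits), checks, Q) == satisfies(bits, checks, 2)
        for bits in product((0, 1), repeat=n)
    )


print(subcode_matches_binary(CHECKS, 6))
```

Because every modulo-4 code on this graph contains a copy of the binary code on the same graph, low-weight binary codewords reappear as words over $\{0, 2\}$.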

    APPENDIX III

    PROOFS FOR SECTION IV

    A. Proof of Theorem 8

    The proof of this theorem follows the lines of the proof of Theorem 4. Equation (48) is carried over from the modulo-$q$ case. Let be a word of type . Recalling Lemma 2, we can assume without loss of generality that is a binary word of weight .

    Each assignment of edges and labels induces a symbol coloring on each of the check node sockets (the th color corresponding to the th element of GF($q$)), according to the adjacent variable node's value multiplied by the label of the connecting edge. Exactly of the sockets are assigned colors of the set . All color assignments satisfying the above requirement are equally probable. The total number of colorings is

    (83)

    Given an assignment of graph edges, the word is a codeword if, at each check node, the GF($q$) sum of the values of its sockets is zero. The number of valid assignments is determined using the enumerating function

    (84)

    where is the weight-enumerating function of -length words over GF($q$) satisfying the requirement that the sum of all symbols must equal zero over GF($q$).

    Using (48), (51) (with , and replaced by

    and ), (83) and (84), and applying Theorem 10

    as in the proof of Theorem 4, we arrive at (25). We now show

    that is given by (26).

    We first examine , defined such that the coefficient of is the number of words of type (the index corresponds to the th element of GF($q$)) whose sum over GF($q$) is zero. A useful expression for is

    (85)

    Elements of GF($q$) can be modeled as -dimensional vectors over (see [3]). The sum of two GF($q$) elements corresponds to the sum of the corresponding vectors, evaluated as the modulo- sum of each of the vector components. Thus, adding elements over GF($q$) is equivalent to adding -dimensional vectors. Letting

    we have the following expression for :

    Consider, for example, the simple case of and .

    To simplify our notations, we denote by and by

    where and are evaluated modulo- . This last equation is clearly the output of the two-dimensional cyclic convolution of with itself, evaluated at zero. In the general case (85), we have the -dimensional cyclic convolution of the function with itself times, evaluated at zero.

    Using the -dimensional DFT of size to evaluate the convolution, we obtain

    IDFT DFT
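The DFT step can be illustrated numerically. The sketch below is not the authors' derivation: it replaces the indeterminates of the enumerating function with plain numeric weights, and the names `p` (prime modulus of each vector component), `m` (vector dimension), and `d` (number of summed elements) are assumptions introduced here. It evaluates the `d`-fold cyclic convolution at zero both through the multidimensional DFT and by direct enumeration of all tuples whose componentwise sum is zero.

```python
import cmath
from itertools import product


def dft(f, p, m):
    """m-dimensional DFT of size p of f, a dict from vectors in Z_p^m to numbers."""
    vectors = list(product(range(p), repeat=m))
    return {
        k: sum(
            f[v] * cmath.exp(-2j * cmath.pi * sum(a * b for a, b in zip(k, v)) / p)
            for v in vectors
        )
        for k in vectors
    }


def conv_at_zero_dft(f, p, m, d):
    """d-fold cyclic convolution of f with itself at zero: IDFT of (DFT f)^d at zero."""
    spectrum = dft(f, p, m)
    # The imaginary part is numerical noise when f is real-valued.
    return (sum(F ** d for F in spectrum.values()) / p ** m).real


def conv_at_zero_brute(f, p, m, d):
    """Same quantity by summing products of weights over all zero-sum d-tuples."""
    vectors = list(product(range(p), repeat=m))
    total = 0
    for combo in product(vectors, repeat=d):
        if all(sum(v[j] for v in combo) % p == 0 for j in range(m)):
            term = 1
            for v in combo:
                term *= f[v]
            total += term
    return total


# Example chosen here: p = 2, m = 2 (vectors representing GF(4)), d = 3, unit weights.
ones = {v: 1 for v in product(range(2), repeat=2)}
print(conv_at_zero_dft(ones, 2, 2, 3), conv_at_zero_brute(ones, 2, 2, 3))
```

With unit weights the result is simply the number of zero-sum tuples, which equals the number of free choices for the first `d - 1` elements.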

    We now examine in (86) at the top of

    the following page. Consider the elements


    (86)

    These elements correspond to the DFT of the function

    , which is a delta function. Thus, we obtain

    Combining the above with (86) we obtain our desired result

    of (26).

    ACKNOWLEDGMENT

    The authors would like to thank the anonymous reviewers for

    their valuable comments.

    REFERENCES

    [1] A. Bennatan and D. Burshtein, "Iterative decoding of LDPC codes over arbitrary discrete-memoryless channels," in preparation.

    [2] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo codes," in Proc. 1993 IEEE Int. Conf. Communications, Geneva, Switzerland, 1993, pp. 1064–1070.

    [3] R. E. Blahut, Theory and Practice of Error Control Codes. Reading, MA: Addison-Wesley, 1984.

    [4] D. Burshtein and G. Miller, "Asymptotic enumeration methods for analyzing LDPC codes," IEEE Trans. Inform. Theory, to be published. [Online]. Available: http://www.eng.tau.ac.il/~burstyn/AsymptEnum.ps

    [5] G. Caire, D. Burshtein, and S. Shamai, "LDPC coding for interference mitigation at the transmitter," in Proc. 40th Annu. Allerton Conf. Communication, Control and Computing, Monticello, IL, Oct. 2002.

    [6] S.-Y. Chung, G. D. Forney, Jr., T. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Commun. Lett., vol. 5, pp. 58–60, Feb. 2001.

    [7] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.

    [8] M. C. Davey and D. MacKay, "Low-density parity check codes over GF($q$)," IEEE Commun. Lett., vol. 2, pp. 165–167, June 1998.

    [9] D. Divsalar and F. Pollara, "Turbo trellis coded modulation with iterative decoding for mobile satellite communications," in Proc. Int. Mobile Satellite Conf., June 1997.

    [10] U. Erez and G. Miller, "The ML decoding performance of LDPC ensembles over $\mathbb{Z}_q$," IEEE Trans. Inform. Theory, submitted for publication.

    [11] G. D. Forney, Jr. and G. Ungerboeck, "Modulation and coding for linear Gaussian channels," IEEE Trans. Inform. Theory, vol. 44, pp. 2384–2415, Oct. 1998.

    [12] R. G. Gallager, Low Density Parity Check Codes. Cambridge, MA: MIT Press, 1963.

    [13] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.

    [14] I. J. Good, "Saddle point methods for the multinomial distribution," Ann. Math. Statist., vol. 28, pp. 861–881, 1957.

    [15] A. Kavcic, X. Ma, and M. Mitzenmacher, "Binary intersymbol interference channels: Gallager codes, density evolution, and code performance bounds," IEEE Trans. Inform. Theory, vol. 49, pp. 1636–1652, July 2003.

    [16] A. Kavcic, X. Ma, M. Mitzenmacher, and N. Varnica, "Capacity approaching signal constellations for channels with memory," in Proc. Allerton Conf., Monticello, IL, Oct. 2001, pp. 311–320.

    [17] S. Litsyn and V. Shevelev, "On ensembles of low-density parity-check codes: Asymptotic distance distributions," IEEE Trans. Inform. Theory, vol. 48, pp. 887–908, Apr. 2002.

    [18] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman, "Improved low-density parity-check codes using irregular graphs," IEEE Trans. Inform. Theory, vol. 47, pp. 585–598, Feb. 2001.

    [19] X. Ma, N. Varnica, and A. Kavcic, "Matched information rate codes for binary ISI channels," in Proc. 2002 IEEE Int. Symp. Information Theory, Lausanne, Switzerland, June/July 2002, p. 269.

    [20] R. J. McEliece, "Are turbo-like codes effective on nonstandard channels?," IEEE Inform. Theory Soc. Newsletter, vol. 51, pp. 1–8, Dec. 2001.

    [21] G. Miller and D. Burshtein, "Bounds on the maximum-likelihood decoding error probability of low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, pp. 2696–2710, Nov. 2001.

    [22] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Francisco, CA: Morgan Kaufmann, 1988.

    [23] T. Richardson and R. Urbanke, "The capacity of low-density parity check codes under message-passing decoding," IEEE Trans. Inform. Theory, vol. 47, pp. 599–618, Feb. 2001.

    [24] T. Richardson, A. Shokrollahi, and R. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, pp. 619–637, Feb. 2001.

    [25] P. Robertson and T. Wörz, "Bandwidth-efficient turbo trellis-coded modulation using punctured component codes," IEEE J. Select. Areas Commun., vol. 16, pp. 206–218, Feb. 1998.

    [26] N. Shulman and M. Feder, "Random coding techniques for nonrandom codes," IEEE Trans. Inform. Theory, vol. 45, pp. 2101–2104, Sept. 1999.

    [27] R. M. Tanner, "A recursive approach to low complexity codes," IEEE Trans. Inform. Theory, vol. IT-27, pp. 533–547, Sept. 1981.

    [28] S. ten Brink, G. Kramer, and A. Ashikhmin, "Design of low-density parity-check codes for multi-antenna modulation and detection," IEEE Trans. Inform. Theory, submitted for publication.

    [29] N. Varnica, X. Ma, and A. Kavcic, "Iteratively decodable codes for bridging the shaping gap in communications channels," in Proc. Asilomar Conf. Signals, Systems and Computers, Pacific Grove, CA, Nov. 2002.

    [30] N. Varnica, X. Ma, and A. Kavcic, "Capacity of power constrained memoryless AWGN channels with fixed input constellations," in Proc. IEEE Global Telecomm. Conf. (GLOBECOM), Taipei, Taiwan, Nov. 2002.

    [31] U. Wachsmann, R. F. Fischer, and J. B. Huber, "Multilevel codes: Theoretical concepts and practical design rules," IEEE Trans. Inform. Theory, vol. 45, pp. 1361–1391, July 1999.