Decoding Turbo Codes and LDPC Codes via Linear Programming
Jon Feldman, MIT LCS
David Karger, MIT LCS
Martin Wainwright, UC Berkeley
J. Feldman, Decoding Turbo Codes and LDPC Codes via Linear Programming – p.1/22
Binary Error-Correcting Code
[Diagram: an information word (e.g. 010011) is mapped by the Encoder to a code word (e.g. 110011101001), which passes through the Noisy Channel; from the corrupt code word, the Decoder produces a decoded code word and a decoded information word.]
• Binary Symmetric Channel (BSC): each bit flipped independently with probability p (a small constant).
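The BSC above can be simulated in a few lines; a minimal sketch (the function name `bsc` is ours, not from the talk):

```python
import random

def bsc(codeword, p, rng=None):
    """Pass a binary word through a binary symmetric channel:
    each bit is flipped independently with probability p."""
    rng = rng or random.Random()
    return [bit ^ (rng.random() < p) for bit in codeword]

clean = bsc([0, 1, 1, 0], 0.0)    # p = 0: nothing flips -> [0, 1, 1, 0]
flipped = bsc([0, 1, 1, 0], 1.0)  # p = 1: every bit flips -> [1, 0, 0, 1]
```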
Turbo Codes + LDPC Codes
• Low-Density Parity-Check (LDPC) codes [Gal '62].
• Turbo codes introduced [BGT '93]: unprecedented error-correcting performance.
• Ensuing LDPC “Renaissance” [SS '94, MN '95, Wib '96, MMC '98, Yed '02, ...].
• Simple encoder, “belief-propagation” decoder.
• Theoretical understanding of good performance:
  - “Threshold” behavior as the block length grows [LMSS '01, RU '01];
  - Decoder unpredictable with cycles.
• Finite-length analysis: combinatorial error conditions known only for the binary erasure channel [DPRTU '02].
Our Contributions
[FK, FOCS '02] [FKW, Allerton '02] [FKW, CISS '03]
• Poly-time decoder using LP relaxation.
• Decodes binary linear codes, LDPC codes, turbo codes.
• “Pseudocodewords”: exact characterization of the error patterns causing failure.
• “Fractional distance” δ_f:
  - LP decoding corrects up to δ_f/2 − 1 errors.
  - Computable efficiently for turbo and LDPC codes.
• Error-rate bounds based on high-girth graphs.
• Closely related to iterative approaches and to other notions of “pseudocodeword.”
Outline
• Error-correcting codes.
• Using LP relaxation for decoding.
• Details of the LP relaxation for binary linear codes.
• Pseudocodewords.
• Fractional distance.
• Girth-based bounds.
Maximum-Likelihood Decoding
• Code C ⊆ {0,1}^n.
• Cost function γ_i: negative log-likelihood ratio of code bit i.
• BSC: γ_i = +1 if received bit y_i = 0, γ_i = −1 if y_i = 1.
• Other channels: γ takes on arbitrary “soft” values.
Given: corrupt code word y ∈ {0,1}^n.
Find: codeword x ∈ C such that Σ_i γ_i x_i is minimized.
• Linear programming formulation:
  - Variables f_i for each code bit, f_i ∈ {0,1}.
  - Linear program: minimize Σ_i γ_i f_i subject to f ∈ conv(C).
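For a code small enough to enumerate, ML decoding under these costs is a brute-force minimization of Σ_i γ_i x_i; a sketch on a hypothetical 4-bit toy code (the talk's point is that the LP replaces this enumeration):

```python
def bsc_costs(received):
    # Negative log-likelihood ratio for the BSC, up to a positive scale:
    # gamma_i = +1 if the received bit is 0, -1 if it is 1.
    return [1 if y == 0 else -1 for y in received]

def ml_decode(received, codewords):
    # ML decoding = minimize sum_i gamma_i * x_i over all codewords x.
    gamma = bsc_costs(received)
    return min(codewords, key=lambda x: sum(g * xi for g, xi in zip(gamma, x)))

# Toy 4-bit linear code (hypothetical, for illustration only).
codewords = [(0, 0, 0, 0), (0, 1, 1, 0), (1, 0, 1, 1), (1, 1, 0, 1)]
decoded = ml_decode((1, 0, 1, 0), codewords)  # one bit flipped from 1011
# -> (1, 0, 1, 1)
```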
Linear Programming Relaxation
• Polytope P: a relaxation, conv(C) ⊆ P ⊆ [0,1]^n.
• Decoder: solve the LP over P using simplex/ellipsoid. If the optimum f* is integral, output f*; else output “error.”
• ML certificate property: all codeword outputs are ML codewords.
• Want low word error rate (WER) := Pr[decoder output ≠ transmitted codeword].
[Diagram: the polytope over the cube, with codeword vertices 000, 101, 110, 011; the noisy and noise-free objective directions are shown.]
• No noise: the transmitted codeword is optimal.
• Noise: a perturbation of the objective function.
• Design the code and the relaxation accordingly.
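The decoder's output rule can be sketched as follows (the function name is hypothetical): an integral LP optimum is a vertex of conv(C) and hence the ML codeword, which is the ML certificate property; a fractional optimum becomes “error.”

```python
def lp_output(f, tol=1e-6):
    """Output rule of the LP decoder: an integral optimum is a vertex of
    conv(C), hence the ML codeword; a fractional optimum means "error"."""
    if all(min(x, 1 - x) <= tol for x in f):
        return [round(x) for x in f]
    return None  # fractional optimum -> decoder reports an error

lp_output([0.0, 1.0, 1.0])  # -> [0, 1, 1]
lp_output([0.5, 0.5, 0.0])  # -> None ("error")
```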
Tanner Graph
• The Tanner graph of a linear code is a bipartite graph modeling the parity-check matrix of the code.
[Diagram: a Tanner graph with seven variable nodes and its check nodes.]
• “Variable nodes” x_1, ..., x_n.
• “Check nodes” c_1, ..., c_m.
• N(c_j): neighborhood of check c_j.
• Code words: all x ∈ {0,1}^n such that, for every check c_j, Σ_{i ∈ N(c_j)} x_i ≡ 0 (mod 2).
• Codewords: 0000000, 1110000, 1011001, etc.
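The parity-check definition of the code can be tested directly; a sketch on a hypothetical two-check, four-bit Tanner graph (not the talk's seven-bit example):

```python
from itertools import product

# Hypothetical small Tanner graph: two checks over four variable nodes.
checks = [(0, 1, 2), (1, 2, 3)]

def is_codeword(x, checks):
    # x is a codeword iff every check's neighborhood has even parity.
    return all(sum(x[i] for i in nbhd) % 2 == 0 for nbhd in checks)

codewords = [x for x in product((0, 1), repeat=4) if is_codeword(x, checks)]
# -> [(0, 0, 0, 0), (0, 1, 1, 0), (1, 0, 1, 1), (1, 1, 0, 1)]
```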
IP/LP Formulation of ML Decoding
• Variables f_i for each code bit i. IP: f_i ∈ {0,1}. LP: f_i ∈ [0,1].
• For check node c_j, E_j = the set of valid (even-weight) configurations of N(c_j):
  E_j = { S ⊆ N(c_j) : |S| even }.
• Variables w_{j,S} for each check node c_j and each S ∈ E_j.
  IP: w_{j,S} ∈ {0,1}. LP: w_{j,S} ∈ [0,1].
[Diagram: a single degree-4 check node with its variables; e.g. the vars are w_{j,∅}, w_{j,{1,2}}, w_{j,{1,3}}, w_{j,{1,4}}, w_{j,{2,3}}, w_{j,{2,4}}, w_{j,{3,4}}, w_{j,{1,2,3,4}}.]
IP/LP Formulation of ML Decoding
• Minimize Σ_i γ_i f_i, subject to:
  - for every check c_j: Σ_{S ∈ E_j} w_{j,S} = 1;
  - for every check c_j and every i ∈ N(c_j): f_i = Σ_{S ∈ E_j : i ∈ S} w_{j,S};
  - 0 ≤ f_i ≤ 1 and w_{j,S} ≥ 0.
• Let P be the relaxed polytope: the set of f ∈ [0,1]^n for which some such w exists.
• IP: an exact formulation of ML decoding.
• What do fractional solutions look like?
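The constraints above can be assembled and handed to an off-the-shelf LP solver; a runnable sketch on a hypothetical four-bit code with checks on bits {0,1,2} and {1,2,3} (the toy code and variable layout are ours, not the talk's), using scipy.optimize.linprog:

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

n = 4
checks = [(0, 1, 2), (1, 2, 3)]  # hypothetical toy code

def even_subsets(nbhd):
    # E_j: all even-sized subsets of a check's neighborhood.
    return [frozenset(c) for r in range(0, len(nbhd) + 1, 2)
            for c in combinations(nbhd, r)]

# Variable layout: f_0..f_{n-1}, then one w_{j,S} per check j and S in E_j.
w_index, nvars = {}, n
for j, nbhd in enumerate(checks):
    for S in even_subsets(nbhd):
        w_index[(j, S)] = nvars
        nvars += 1

A_eq, b_eq = [], []
for j, nbhd in enumerate(checks):
    # Distribution constraint: sum_S w_{j,S} = 1.
    row = np.zeros(nvars)
    for S in even_subsets(nbhd):
        row[w_index[(j, S)]] = 1.0
    A_eq.append(row); b_eq.append(1.0)
    # Consistency: f_i = sum over S containing i of w_{j,S}.
    for i in nbhd:
        row = np.zeros(nvars)
        row[i] = 1.0
        for S in even_subsets(nbhd):
            if i in S:
                row[w_index[(j, S)]] -= 1.0
        A_eq.append(row); b_eq.append(0.0)

received = [1, 0, 1, 1]  # a codeword of the toy code, received noise-free
gamma = [1.0 if y == 0 else -1.0 for y in received]
c = np.concatenate([gamma, np.zeros(nvars - n)])  # w vars have zero cost

res = linprog(c, A_eq=np.array(A_eq), b_eq=b_eq,
              bounds=[(0, 1)] * nvars, method="highs")
decoded = [int(round(float(x))) for x in res.x[:n]]  # integral -> ML codeword
```

Here the LP optimum is integral, so the decoder outputs the ML codeword 1011 with cost −3.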
Fractional Solutions
[Diagram: the Tanner graph above, annotated with a cost vector γ on the variable nodes.]
• For a suitable cost vector γ, the ML codeword attains a certain positive cost.
• A fractional solution — with entries in {0, 1/2, 1} — satisfies all of the LP constraints, yet has strictly smaller cost than the ML codeword.
• So the LP optimum can be fractional, and the decoder then outputs “error.”
LP Decoding Success Conditions
• Pr[decoding success] = Pr[the transmitted codeword x is the unique OPT].
• Assume x = 0 (the all-zeros codeword):
  - a common assumption for linear codes;
  - OK in this case due to the symmetry of the polytope.
• Pr[0 is the unique OPT] = Pr[all other solutions have cost > 0].
Theorem [FKW, CISS '03]: Assume the all-zeros codeword was sent. Then the LP decodes correctly iff all non-zero points in P have positive cost.
Pseudocodewords
• Pseudocodewords are scaled points in P.
[Diagram: the Tanner graph example again.]
• Previous example: the fractional solution with entries in {0, 1/2, 1}.
• Scaled to integers: multiplying by 2 gives an integer vector.
• Natural combinatorial definition of a pseudocodeword (independent of the LP relaxation).
Theorem [FKW, CISS '03]: LP decoding succeeds iff all pseudocodewords have cost > 0.
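The scaling step can be sketched as follows (the helper name is ours): multiply a fractional vertex by the least common multiple of its denominators to obtain an integer vector.

```python
from fractions import Fraction
from math import lcm

def scale_to_integers(point):
    """Scale a fractional LP vertex by the LCM of its denominators,
    turning it into an integer vector (a pseudocodeword)."""
    fracs = [Fraction(x).limit_denominator(10**6) for x in point]
    m = lcm(*(f.denominator for f in fracs))
    return [int(f * m) for f in fracs], m

pseudo, scale = scale_to_integers([0.0, 0.5, 0.5, 1.0])
# -> pseudo = [0, 1, 1, 2], scale = 2
```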
Fractional Distance
• Classical distance:
  - d = minimum Hamming distance between distinct codewords of C.
• Adversarial performance bound:
  - ML decoding can correct up to ⌈d/2⌉ − 1 errors.
• Another way to define the minimum distance:
  - d = minimum ℓ1 distance between two integral vertices of P.
• Fractional distance:
  - δ_f = minimum ℓ1 distance between an integral vertex and any other vertex of P.
  - Equivalently, δ_f = minimum weight of a non-zero vertex of P.
  - Lower bound on the classical distance: δ_f ≤ d.
  - LP decoding can correct up to δ_f/2 − 1 errors.
LP Decoding corrects δ_f/2 − 1 errors
• Suppose fewer than δ_f/2 errors occur.
• Let f ≠ 0 be a vertex of P; by the definition of δ_f, Σ_i f_i ≥ δ_f.
• When the all-zeros codeword is sent, γ_i = −1 if bit i was flipped, +1 otherwise. So
  Σ_i γ_i f_i = Σ_i f_i − 2 Σ_{flipped i} f_i.
• Since each f_i ≤ 1 and fewer than δ_f/2 bits were flipped,
  Σ_i γ_i f_i > δ_f − 2(δ_f/2) = 0.
• Therefore every non-zero vertex has positive cost, and the LP decodes correctly.
Computing the Fractional Distance
• Computing the classical distance d for linear/LDPC codes is NP-hard.
• If the polytope has small size (LDPC), the fractional distance is easily computed.
  - More general problem: given an LP, find the two best vertices x*, x'.
  - Algorithm:
    * Find the optimum x*.
    * Guess the facet on which x' sits but x* does not.
    * Set that facet to equality, obtaining a restricted LP.
    * Minimize the objective over the restricted LP.
• Good approximation to the classical distance?
• Good prediction of relative classical distance?
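The facet-guessing idea can be sketched for the all-zeros case, where the quantity computed is exactly the minimum weight of a non-zero vertex (the fractional distance). The function and toy polytope below are illustrative assumptions, with the polytope given explicitly as A_ub·x ≤ b_ub plus box bounds:

```python
import numpy as np
from scipy.optimize import linprog

def min_weight_nonzero_vertex(A_ub, b_ub, n):
    """Every vertex other than 0 is tight on some facet that 0 does not
    lie on.  Fix each such facet to equality in turn and minimize the
    weight sum_i x_i over {x in [0,1]^n : A_ub x <= b_ub}."""
    c, best = np.ones(n), None
    # Facets A_ub rows with rhs != 0 are not tight at the zero vertex ...
    candidates = [(list(row), rhs) for row, rhs in zip(A_ub, b_ub) if rhs != 0]
    # ... and neither are the x_i = 1 box facets.
    candidates += [([1.0 if j == i else 0.0 for j in range(n)], 1.0)
                   for i in range(n)]
    for row, rhs in candidates:
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=[row], b_eq=[rhs],
                      bounds=[(0, 1)] * n, method="highs")
        if res.success and (best is None or res.fun < best):
            best = res.fun
    return best

# Toy polytope {x in [0,1]^2 : x0 + x1 <= 1}: vertices (0,0), (1,0), (0,1).
dfrac = min_weight_nonzero_vertex([[1.0, 1.0]], [1.0], 2)  # -> 1.0
```

The key point is that the minimum over each fixed facet is attained at a vertex of the polytope other than 0, so the overall minimum is the fractional distance.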
Using Girth for Error Bounds
• For rate-1/2 RA (cycle) codes: if the graph has large girth, negative-cost pseudocodewords (promenades) are rare.
• Erdős (or [BMMS '02]): there exist Hamiltonian 3-regular graphs with girth Ω(log n).
Theorem [FK, FOCS '02]: For any crossover probability below a fixed constant, if the girth is Ω(log n), the WER is inverse-polynomial in n.
• Arbitrary Tanner graph with girth g, all variable nodes of degree ≥ 3:
Theorem [FKW, CISS '03]: δ_f grows exponentially in the girth g.
• Since girth Θ(log n) is achievable, this gives δ_f = n^{Ω(1)}. Stronger graph properties (expansion?) are needed for stronger results.
Other “pseudocodewords”
• BEC: iterative decoding is successful iff there are no zero-cost “stopping sets” [DPRTU '02].
  - In the BEC, pseudocodewords = stopping sets.
  - Iterative and LP decoding: same performance on the BEC.
• Tail-biting trellises (TBT): iterative decoding is successful iff the “dominant pseudocodeword” has negative cost [FKMT '98].
  - TBT: need an LP along the lines of [FK, FOCS '02].
  - Iterative and LP decoding: same performance on TBT.
• “Min-sum” decoding is successful iff there are no negative-cost “deviation sets” in the computation tree [Wib '96].
  - Pseudocodewords are a natural “closed” analog of deviation sets.
Other Results
• For “high-density” binary linear codes, we need a representation of P without exponential dependence on the check-node degree.
  - Use the “parity polytope” of Yannakakis ['91].
  - Original representation: size exponential in the check degree.
  - Using parity polytopes: polynomial size.
• New iterative methods [FKW, Allerton '02]:
  - Iterative “tree-reweighted max-product” [WJW '02] tries to solve the dual of our LP.
  - A subgradient method for solving the LP gives a provably convergent iterative algorithm.
• Experiments on performance and distance bounds.
Performance Comparison
[Plot: word error rate (10^-5 to 10^0) vs. BSC crossover probability (10^-2.4 to 10^-0.8) for a random rate-1/2 (3,6) LDPC code; curves: Min-Sum decoder (100 iterations), LP decoder, and both in error.]
• Length 200, left degree 3, right degree 6.
Growth of Average Fractional Distance
[Plot: average fractional distance (roughly 5 to 10) vs. code length (0 to 500) for the rate-1/4 Gallager ensemble.]
• “Gallager” distribution.
Future Work
• New WER and fractional distance bounds:
  - Lower-rate turbo codes (rate-1/3 RA).
  - Other LDPC codes, including:
    * expander codes, irregular LDPC codes, other constructible families.
  - Random LDPC and linear codes?
• ML decoding using IP, branch-and-bound?
• Using generic “lifting” procedures to tighten the relaxation?
• Deeper connections to “sum-product” belief propagation?
• LP decoding of other code families and channel models?