ELEC 619A Selected Topics in Digital Communications: Information Theory Error Correcting Codes II What is Possible?


  • ELEC 619A Selected Topics in Digital Communications:

    Information Theory

    Error Correcting Codes II

    What is Possible?

  • Notation and Examples

    An (n,k,d)-code C is a linear code such that

    • n - is the length of the codewords

    • k - is the number of data symbols in a codeword

    • d - is the minimum distance in C

    Examples: C1 = {00, 01, 10, 11} is a (2,2,1)-code.

    C2 = {000, 011, 101, 110} is a (3,2,2)-code.

    C3 = {00000, 11100, 00111, 11011} is a (5,2,3)-code.

    A good (n,k,d) code has small n and large k and d.

    2
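    Since the parameters (n,k,d) are used throughout the rest of the deck, a minimal sketch (plain Python, standard library only) that computes the minimum distance of the three example codes directly from their codeword lists may be helpful; the codeword sets are taken from this slide.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(code):
    """Minimum distance d of a block code given as a list of codeword strings."""
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))

C1 = ["00", "01", "10", "11"]
C2 = ["000", "011", "101", "110"]
C3 = ["00000", "11100", "00111", "11011"]

for name, C in [("C1", C1), ("C2", C2), ("C3", C3)]:
    n = len(C[0])
    k = len(C).bit_length() - 1      # for these linear codes, k = log2(|C|)
    print(name, (n, k, minimum_distance(C)))   # (2,2,1), (3,2,2), (5,2,3)
```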

  • (5,2,3) Code

    • k × n generator matrix

    3
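    The generator matrix itself did not survive extraction. A plausible choice, consistent with the (5,2,3) code C3 = {00000, 11100, 00111, 11011} listed on the previous slide, is sketched below; the particular G is an assumption (any two linearly independent codewords of C3 would serve as rows).

```python
import numpy as np

# Assumed 2 x 5 generator for the (5,2,3) code: rows are two independent codewords of C3.
G = np.array([[1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1]])

# Enumerate all 2^k messages and encode them: c = m G (mod 2).
codewords = {"".join(map(str, (np.array(m) @ G) % 2)) for m in [(0, 0), (0, 1), (1, 0), (1, 1)]}
print(sorted(codewords))  # ['00000', '00111', '11011', '11100'] = C3
```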


  • Advantages of Linear Block Codes

    1. The minimum distance h(C) is easy to compute if C is a linear code, since it equals the minimum weight of a non-zero codeword.

    2. Linear codes have simple specifications.

    • To specify a non-linear code usually all codewords have to be listed.

    • To specify a linear [n,k] -code it is enough to list k linearly independent codewords.

    Definition A k × n matrix whose rows form a basis for a linear [n,k]-code (subspace) C is said to be a generator matrix for C.

    Example A generator matrix for the code C = {000, 111} is G = [ 1 1 1 ]

    3. There are simple encoding/decoding procedures for linear codes.

    5

  • Important Linear Block Codes

    There are many classes of practical linear block codes:

    • Hamming codes

    • Cyclic codes

    • BCH codes

    • Reed-Solomon codes

    • LDPC codes

    • Turbo codes

    • …

    6

  • Equivalence of Linear Codes

    Definition Two linear codes are called equivalent if one can be obtained from the other by the following operations:

    (a) permutation of the positions of the code;

    (b) multiplication of symbols in a fixed position by a non-zero scalar.

    Theorem Two k × n matrices generate equivalent linear (n,k) codes if one matrix can be obtained from the other by a sequence of the following operations:

    (a) permutation of the rows

    (b) multiplication of a row by a non-zero scalar

    (c) addition of one row to another

    (d) permutation of columns

    (e) multiplication of a column by a non-zero scalar

    Proof Operations (a) - (c) just replace one basis by another. The last two operations convert a generator matrix to one of an equivalent code.

    7

  • Equivalent Codes

    Distances between codewords are unchanged by equivalence operations (a) - (b). Consequently, equivalent linear codes have the same parameters (n,k,d) (and correct the same number of errors).

    8

  • Systematic Codes

    • For a systematic block code the dataword appears

    unaltered in the codeword – usually at the start

    • The generator matrix has the structure

    G = [ Ik | P ] =

    [ 1 0 … 0   p11   p12   …   p1,n-k ]
    [ 0 1 … 0   p21   p22   …   p2,n-k ]
    [ …                                ]
    [ 0 0 … 1   pk1   pk2   …   pk,n-k ]

    (k identity columns followed by n − k parity columns)

    • P is often referred to as the parity matrix

    9

  • Systematic Codes

    Example

    G4 = [ I4 | P ] =

    [ 1 0 0 0 0 1 1 ]
    [ 0 1 0 0 1 0 1 ]
    [ 0 0 1 0 1 1 0 ]
    [ 0 0 0 1 1 1 1 ]

    m = (1 0 1 1)  →  c = mG4 = (1 0 1 1 0 1 0)

    m = (0 1 1 1)  →  c = mG4 = (0 1 1 1 1 0 0)

    The first k digits are the information digits, the last n − k digits are the redundancy (parity-check digits).

    10
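    A minimal sketch of systematic encoding with the G4 reconstructed above ([ I4 | P ]); the two messages match the examples on this slide.

```python
import numpy as np

I4 = np.eye(4, dtype=int)
P = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]])
G = np.hstack([I4, P])          # systematic generator [I_k | P]

def encode(m):
    """Systematic encoding c = m G over GF(2)."""
    return (np.array(m) @ G) % 2

print(encode([1, 0, 1, 1]))     # [1 0 1 1 0 1 0] -> information digits, then parity digits
print(encode([0, 1, 1, 1]))     # [0 1 1 1 1 0 0]
```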

  • Systematic Codes

    Theorem Let G be a generator matrix of an [n,k] -code. Then by

    operations (a) - (e) the matrix G can be transformed into the form

    [ Ik | A ]

    where Ik is the k × k identity matrix, and A is a k × (n − k) matrix.

    Example

    11

  • Encoding with a Systematic Code

    • vector-matrix multiplication

    • Encoding of a message m = (m1, … , mk) with C:

      c = mG = m [ Ik | P ] = ( m | b ),  where b = mP gives the parity bits

    Example Let C be a (7,4) code with generator matrix

    12

  • Example

    A message (m1, m2, m3, m4) is encoded as:

    (0 0 0 0) is encoded as ………………………….. ?

    (1 0 0 0) is encoded as ………………………….. ?

    (1 1 1 0) is encoded as ………………………….. ?

    13

  • Dual Codes

    The dual code C^⊥ of a code C is the set of all length-n words orthogonal to every codeword of C.

    Theorem Suppose C is a linear (n,k) code, then the dual code C^⊥ is a linear (n,n-k) code.

    Example If C6 = {000, 011, 101, 110}, then C6^⊥ = {000, 111}

    14

  • Dual Codes

    For the (n,1) repetition code C with generator matrix

    G = [1,1, … ,1]

    the dual code C^⊥ is an (n,n-1) SPC code with generator matrix G^⊥

    described by

    G^⊥ =

    [ 1 1 0 0 … 0 ]
    [ 1 0 1 0 … 0 ]
    [ ⋮           ]
    [ 1 0 0 0 … 1 ]

    The rows of G are orthogonal to the rows of G^⊥

    15

  • Parity Check Matrices

    Definition A parity-check matrix H for an (n,k) code C is a generator matrix of C^⊥.

    Theorem If H is a parity-check matrix of C, then

    C = { x ∈ V(n,2) | xH^T = 0 }

    and therefore any linear code is completely specified by a parity-check matrix.

    Example A parity-check matrix for C6 = {000, 011, 101, 110} is H = [ 1 1 1 ]

    The rows of a parity check matrix are parity checks on codewords. They specify that certain linear combinations of the coordinates of every codeword are zero.

    16

  • Systematic Parity Check Matrices

    If G = ( Ik | P ) is the standard form (systematic) generator matrix of a binary (n,k) code C, then a parity-check matrix for C is H = ( P^T | I_(n-k) )

    Example

    G = [ I4 | P ] =
    [ 1 0 0 0 0 1 1 ]
    [ 0 1 0 0 1 0 1 ]
    [ 0 0 1 0 1 1 0 ]
    [ 0 0 0 1 1 1 1 ]

    H = [ P^T | I3 ] =
    [ 0 1 1 1 1 0 0 ]
    [ 1 0 1 1 0 1 0 ]
    [ 1 1 0 1 0 0 1 ]

    17
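    A short sketch verifying the H = ( P^T | I_(n-k) ) construction for the systematic (7,4) example above: every row of G is orthogonal to every row of H, so GH^T = 0 over GF(2).

```python
import numpy as np

P = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])        # [I_4 | P]
H = np.hstack([P.T, np.eye(3, dtype=int)])      # [P^T | I_3]

# Every codeword c = mG satisfies cH^T = 0, which follows from G H^T = P + P = 0 (mod 2).
print((G @ H.T) % 2)   # 4 x 3 all-zero matrix
```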

  • Decoding Linear Codes

    • One possibility is a ROM look-up table

    • In this case the received word r is used as an address

    • Example – Even single parity check code

      Address   Entry

    000000 0

    000001 1

    000010 1

    000011 0

    ……… .

    • The output is an error flag, i.e., 0 – codeword valid

    • If no error, the dataword is the first k codeword bits

    • For an error correcting code, the ROM can also store datawords after correction

    18

  • Standard Array

    • The Standard Array is constructed as follows

    c1 = 0 (all zero)    c2                  c3                  …    cM
    e1                   c2 + e1             c3 + e1             …    cM + e1
    e2                   c2 + e2             c3 + e2             …    cM + e2
    e3                   c2 + e3             c3 + e3             …    cM + e3
    ⋮                    ⋮                   ⋮                        ⋮
    e_(2^(n-k)-1)        c2 + e_(2^(n-k)-1)  c3 + e_(2^(n-k)-1)  …    cM + e_(2^(n-k)-1)

    • The array has 2^k columns (equal to the number of valid codewords) and 2^(n-k) rows (the number of correctable error patterns)

    19

  • Standard Array

    • The standard array is formed by initially choosing the ei to be

      – All 1 bit error patterns

    – All 2 bit error patterns

    ……

    • Ensure that each new error pattern is not already in the array

    • All correctable error patterns will appear as coset leaders

    20

  • Standard Array Decoding

    • Assume that the received word is r = c2 + e3 (shown in bold in the standard array)

    • The most likely codeword is the one at the top of the column containing c2 + e3

    • The corresponding error pattern is the coset leader at the start of the row containing c2 + e3

    • Can be implemented using a look-up table (ROM) which maps all words in the array to the most likely codeword (the one at the top of the column containing the received word).

    21

  • Standard Array for the (5,2,3) Code

    22
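    The array itself is not reproduced in this extract. The sketch below builds it for the (5,2,3) code, using the generator assumed earlier (G = [11100; 00111], an assumption consistent with the codeword list on slide 2); each coset leader is chosen as a lowest-weight word not yet in the array, as described on the previous slides.

```python
import numpy as np
from itertools import product

G = np.array([[1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1]])          # assumed generator of the (5,2,3) code
n, k = 5, 2

codewords = [tuple((np.array(m) @ G) % 2) for m in product([0, 1], repeat=k)]

# Build the standard array: 2^(n-k) rows, each headed by a coset leader.
all_words = sorted(product([0, 1], repeat=n), key=sum)   # order candidate leaders by weight
used, rows = set(), []
for leader in all_words:
    if leader in used:
        continue
    coset = [tuple((np.array(leader) + np.array(c)) % 2) for c in codewords]
    rows.append((leader, coset))
    used.update(coset)

for leader, coset in rows:                                # 8 rows x 4 columns
    print("leader", "".join(map(str, leader)), ":",
          " ".join("".join(map(str, w)) for w in coset))
```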

  • Syndromes

    • For a received word r, we need a simpler method to determine the codeword estimate

    • To do this we use the syndrome s of a received word r:

      s = rH^T

    • If c was corrupted by an error vector e, then

      r = c + e

      and

      s = rH^T = (c + e)H^T = cH^T + eH^T = 0 + eH^T

    • The syndrome depends only on the error vector e

    23

  • Syndromes and Cosets

    Definition Suppose H is a parity-check matrix of an (n,k) code C. Then for any r ∈ V(n,q) the following word is called the syndrome of r:

    S(r) = rH^T

    Lemma Two words have the same syndrome iff they are in the same coset.

    Proof If r1 and r2 lie in the same coset, then r1 = c1 + e and r2 = c2 + e for codewords c1, c2 and the common coset leader e, so S(r1) = (c1 + e)H^T = eH^T = (c2 + e)H^T = S(r2). Conversely, if S(r1) = S(r2) then (r1 − r2)H^T = 0, so r1 − r2 is a codeword and r1, r2 are in the same coset.

    24

  • Syndromes

    • We can add the same error pattern to different codewords and get the same syndrome.

      – There are 2^(n-k) syndromes but 2^n error patterns

      – Example: for a (5,2) code there are 8 syndromes and 32 error patterns

      – Example: a (7,4) code has 8 syndromes and 128 error patterns

        • The 8 syndromes provide values to indicate single errors in any of the 7 bit positions, as well as

    • Only need to determine which error pattern corresponds to the syndrome

    25

  • Syndromes for the (5,2,3) Code

    26

  • Syndrome Decoding

    • This block diagram shows the syndrome decoder implementation

    r → [ compute syndrome s = rH^T ] → [ look-up table ] → e → ĉ = r + e

    27

  • Syndrome Decoding

    • For systematic linear block codes

      G = [ I | P ]  so  H = [ -P^T | I ]

    • Example (7,4,3) code

      G = [ I | P ] =
      [ 1 0 0 0 0 1 1 ]
      [ 0 1 0 0 1 0 1 ]
      [ 0 0 1 0 1 1 0 ]
      [ 0 0 0 1 1 1 1 ]

      H = [ -P^T | I ] =
      [ 0 1 1 1 1 0 0 ]
      [ 1 0 1 1 0 1 0 ]
      [ 1 1 0 1 0 0 1 ]

    28

  • Syndrome Decoding - Example

    • For a received word r = (1 1 0 1 0 0 1):

      s = rH^T = (1 1 0 1 0 0 1) ·

      [ 0 1 1 ]
      [ 1 0 1 ]
      [ 1 1 0 ]
      [ 1 1 1 ]
      [ 1 0 0 ]
      [ 0 1 0 ]
      [ 0 0 1 ]

      = (0 0 0)

    • So r is a valid codeword

    29

  • Syndrome Decoding - Example

    • The same codeword, this time with an error in the last (seventh) bit position: r = (1 1 0 1 0 0 0)

      s = rH^T = (1 1 0 1 0 0 0) ·

      [ 0 1 1 ]
      [ 1 0 1 ]
      [ 1 1 0 ]
      [ 1 1 1 ]
      [ 1 0 0 ]
      [ 0 1 0 ]
      [ 0 0 1 ]

      = (0 0 1)

    • Syndrome (0 0 1) matches the seventh column of H and so indicates an error in bit 7 of the received word

    30
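    A sketch of the table-driven syndrome decoder for the (7,4,3) code above: the syndrome of each correctable (here, single-bit) error pattern is precomputed, and decoding is a table lookup plus an addition. It reproduces the r = (1101000) example.

```python
import numpy as np

P = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
H = np.hstack([P.T, np.eye(3, dtype=int)])          # [P^T | I_3]
n = 7

def syndrome(r):
    return tuple((np.array(r) @ H.T) % 2)

# Look-up table: syndrome -> correctable error pattern (all-zero and single-bit errors).
table = {syndrome([0] * n): np.zeros(n, dtype=int)}
for i in range(n):
    e = np.zeros(n, dtype=int)
    e[i] = 1
    table[syndrome(e)] = e

r = np.array([1, 1, 0, 1, 0, 0, 0])
s = syndrome(r)
c_hat = (r + table[s]) % 2
print(s, c_hat)   # (0, 0, 1) -> error in bit 7; corrected codeword [1 1 0 1 0 0 1]
```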

  • Single Error Correcting Codes

    31

  • Hamming Codes

    • (7,4,3) Hamming code

      G = [ I | P ] =
      [ 1 0 0 0 0 1 1 ]
      [ 0 1 0 0 1 0 1 ]
      [ 0 0 1 0 1 1 0 ]
      [ 0 0 0 1 1 1 1 ]

      H = [ -P^T | I ] =
      [ 0 1 1 1 1 0 0 ]
      [ 1 0 1 1 0 1 0 ]
      [ 1 1 0 1 0 0 1 ]

    32

  • Hamming Code Parameters

    33
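    The parameter table itself is not reproduced in this extract; the sketch below tabulates (n, k, d) for the first few binary Hamming codes using the formulas n = 2^m − 1, k = 2^m − 1 − m, d = 3 stated on the Hamming code definition slide further on.

```python
# Binary Hamming code parameters: n = 2^m - 1, k = n - m, d = 3.
for m in range(2, 7):
    n = 2 ** m - 1
    k = n - m
    print(f"m={m}: ({n},{k},3), rate {k / n:.3f}")
# m=2: (3,1,3)   m=3: (7,4,3)   m=4: (15,11,3)   m=5: (31,26,3)   m=6: (63,57,3)
```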

  • Comments about H

    • The minimum distance of the code is equal to the minimum number of columns of H which sum to zero

    • For any codeword c

      cH^T = [ c0, c1, …, c_(n-1) ] ·

      [ d0      ]
      [ d1      ]
      [ ⋮       ]
      [ d_(n-1) ]

      = c0 d0 + c1 d1 + … + c_(n-1) d_(n-1) = 0

      where d0, d1, …, d_(n-1) are the column vectors of H (the rows of H^T)

    • cH^T is a linear combination of the columns of H

    34

  • Comments about H

    • For a codeword of weight w (w ones), cH^T is a linear combination of w columns of H.

    • Thus we have a one-to-one mapping between weight-w codewords and linear combinations of w columns of H that sum to 0.

    • The minimum value of w for which some weight-w codeword c gives cH^T = 0 determines dmin = w

    35

  • Example

    • For the (7,4,3) code, a codeword with weight dmin = 3 is given by the first row of G, i.e., c = (1000011)

    • The linear combination of the first and last 2 columns in H gives

    (011)+(010)+(001) = 0

    • Thus a minimum of 3 columns (= dmin) are required to get a zero value for cH^T

    36

  • 37

    Hamming Codes

    Definition Let m be an integer and H be an m × (2^m − 1) matrix whose columns are the distinct non-zero words from V(m,2). The code having H as its parity-check matrix is called a binary Hamming code and is denoted by Ham(m,2).

    The Hamming code Ham(m,2) is a (2^m − 1, 2^m − 1 − m, 3) code

    m = n − k

    The coset leaders are precisely the words of weight 1. The syndrome of the word 0…010…0 with 1 in the j-th position and 0 otherwise is the transpose of the j-th column of H.

    Example Ham(2,2) is the (3,1,3) repetition code, with

    H = [ 1 1 0 ]
        [ 1 0 1 ]

    G = [ 1 1 1 ]

  • 38

    Decoding Hamming Codes

    For the case that the columns of H are arranged in order of increasing binary numbers representing the column positions 1 to 2^m − 1:

    • Step 1 Given r compute the syndrome S(r) = rH^T

    • Step 2 If S(r) = 0, then r is assumed to be the codeword sent

    • Step 3 If S(r) ≠ 0, then assuming a single error, S(r) read as a binary number gives the position of the error

  • 39

    Example

    For the Hamming code given by the parity-check matrix

    H = [ 0 0 0 1 1 1 1 ]
        [ 0 1 1 0 0 1 1 ]
        [ 1 0 1 0 1 0 1 ]

    the received word r = 1101011 has syndrome S(r) = 110, and therefore the error is in the sixth position.

    Hamming codes were originally used to deal with errors in long-distance telephone calls.
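    A sketch of the three-step decoding procedure with the columns of H arranged as the binary representations of the positions 1..7, as in the example above; the non-zero syndrome is read directly as the error position.

```python
import numpy as np

m = 3
n = 2 ** m - 1
# Column j (1-indexed) of H is the binary representation of j, MSB in the top row.
H = np.array([[(j >> (m - 1 - i)) & 1 for j in range(1, n + 1)] for i in range(m)])

def decode(r):
    r = np.array(r)
    s = (r @ H.T) % 2                       # step 1: syndrome
    pos = int("".join(map(str, s)), 2)      # read the syndrome as a binary number
    if pos:                                 # step 3: single error at that position
        r[pos - 1] ^= 1
    return r, pos

r = [1, 1, 0, 1, 0, 1, 1]
print(H)            # rows 0001111 / 0110011 / 1010101
print(decode(r))    # syndrome 110 -> error in position 6, corrected word [1 1 0 1 0 0 1]
```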

  • The Main Coding Theory Problem

    A good (n,M,d) code has small n, large M and large d.

    The main coding theory problem is to optimize one of the parameters n, M, d for given values of the other two.

    For linear codes, a good (n,k,d) code has small n, large k and large d.

    The main coding theory problem for linear codes is to optimize one of the parameters n, k, d for given values of the other two.

    40

  • Hamming Bound

    The number of words of length n and weight i is the binomial coefficient

    C(n,i) = n! / ( i! (n − i)! )

    Let v be a word of length n. For 0 ≤ t ≤ n, the number of words of length n at distance at most t from v is

    C(n,0) + C(n,1) + C(n,2) + … + C(n,t)

    41

  • Hamming Bound

    • For C a code of length n and distance d = 2t+1 or 2t+2:

      |C| ≤ 2^n / ( C(n,0) + C(n,1) + C(n,2) + … + C(n,t) )

    • Example: Give an upper bound on the size of a linear code C of length n = 6 and distance d = 3 (so t = 1):

      |C| ≤ 2^6 / ( C(6,0) + C(6,1) ) = 64/7 ≈ 9.1

    • This gives |C| ≤ 9, but the size of a linear code C must be a power of 2, so |C| ≤ 8

    42
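    A minimal sketch of the Hamming (sphere-packing) bound as a function; it reproduces the n = 6, d = 3 example above and can be reused for the later examples.

```python
from math import comb, floor, log2

def hamming_bound(n, d, q=2):
    """Upper bound on |C| from the sphere-packing (Hamming) bound."""
    t = (d - 1) // 2
    volume = sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))
    return q ** n // volume                  # floor of q^n / sphere volume

M = hamming_bound(6, 3)                      # 64 / 7 = 9.14... -> 9
print(M, 2 ** floor(log2(M)))                # 9, and rounding down to a power of 2: 8
```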

  • Hamming Bound for Binary Linear Codes

    For a binary (n,k) linear code correcting t errors:

      2^k ( C(n,0) + C(n,1) + C(n,2) + … + C(n,t) ) ≤ 2^n

    equivalently

      Σ_{i=0}^{t} C(n,i) ≤ 2^(n-k)

    • A code that meets the Hamming bound with equality is called perfect

    • Which codes meet this bound?

    43

  • Hamming Bound for Nonbinary Linear Codes

    For a q-ary (n,k) linear code correcting t errors:

      q^k ( C(n,0) + C(n,1)(q−1) + C(n,2)(q−1)^2 + … + C(n,t)(q−1)^t ) ≤ q^n

    equivalently

      Σ_{i=0}^{t} C(n,i)(q−1)^i ≤ q^(n-k)

    44

  • Golay Codes

    • Marcel Golay considered the problem of perfect codes in 1949

    • He found three possible solutions to equality for the Hamming bound

      – q = 2, n = 23, t = 3

    – q = 2, n = 90, t = 2

    – q = 3, n = 11, t = 2

    • Only the first and third codes exist [Van Lint and Tietäväinen, 1973]

    45

  • Gilbert Bound

    • There exists a code of length n, distance d, and M codewords with

      M ≥ 2^n / ( C(n,0) + C(n,1) + … + C(n,d−1) )

    • The bound also holds for linear codes

      – Let k be the largest integer such that

        2^k ≤ 2^n / ( C(n,0) + C(n,1) + … + C(n,d−1) )

        then an (n,k,d) code exists

    46

  • Gilbert-Varshamov Bound

    • For linear codes, the Gilbert bound can be improved

    • There exists a linear code of length n, dimension k and distance d if

      C(n−1,0) + C(n−1,1) + … + C(n−1,d−2) < 2^(n-k)

    • Proof: construct a parity check matrix based on this condition.

    • Thus if k is the largest integer such that

      2^k ≤ 2^n / ( C(n−1,0) + C(n−1,1) + … + C(n−1,d−2) )

      then an (n,k,d) code exists

    47
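    A sketch of the Gilbert-Varshamov existence check for binary linear codes; it reproduces the (9,2,5) example on the next slide and the (15,7,5) check a few slides later.

```python
from math import comb

def gv_exists(n, k, d):
    """Gilbert-Varshamov condition: a binary linear (n,k,d) code exists if this holds."""
    return sum(comb(n - 1, j) for j in range(d - 1)) < 2 ** (n - k)

print(gv_exists(9, 2, 5))    # True:  1 + 8 + 28 + 56 = 93 < 128 = 2^7
print(gv_exists(15, 7, 5))   # False: 1 + 14 + 91 + 364 = 470 > 256 = 2^8 (a code may still exist)
```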

  • Example

    • Does there exist a linear code of length n=9, dimension k=2, and distance d=5? Yes, because

      C(8,0) + C(8,1) + C(8,2) + C(8,3) = 1 + 8 + 28 + 56 = 93 < 128 = 2^(9−2)

    • Give a lower and an upper bound on the dimension, k, of a linear code with n=9 and d=5.

    • G-V lower bound: 2^k ≤ 2^9 / 93 ≈ 5.5, but |C| is a power of 2, so |C| ≥ 4 (k ≥ 2)

    48

  • Example (Cont.)

    • Hamming upper bound:

      |C| ≤ 2^9 / ( C(9,0) + C(9,1) + C(9,2) ) = 512/46 ≈ 11.1

      but |C| is a power of 2 so |C| ≤ 8 (k ≤ 3)

    • From the tables, the optimal codes are

      – (9,2,6): so the G-V bound is exceeded – sufficient but not necessary

      – (9,3,4): so the Hamming bound is necessary but not sufficient for a code to exist

    49

  • G-V Bound and Codes

    • Does a (15, 7, 5) linear code exist?

    • Check the G-V bound:

      C(14,0) + C(14,1) + C(14,2) + C(14,3) = 1 + 14 + 91 + 364 = 470 > 256 = 2^(15−7)

    • The G-V bound does not hold, so it does not tell us whether or not such a code exists.

    • But actually such a code does exist - the (15,7,5) BCH code

    50

  • The Nordstrom-Robinson Code

    • Adding an overall parity check to the (15,7,5) code gives a (16,7,6) linear code

    – This code is optimal

    – The G-V bound says a (16,5,6) code exists

    • A (16,256,6) nonlinear code exists

    – Twice as many codewords as the best linear code

    51

  • Singleton Bound

    • Singleton bound (upper bound)

      For any (n, k, d) linear code, d − 1 ≤ n − k

      k ≤ n − d + 1  or  |C| ≤ 2^(n−d+1)

      Proof: the parity check matrix H of an (n,k,d) linear code is an (n−k) × n matrix such that every d−1 columns of H are linearly independent. Since the columns have length n−k, we can never have more than n−k independent columns. Hence d−1 ≤ n−k.

    • Theorem

    For an (n, k, d) linear code C, the following are equivalent:• d = n-k+1

    • Every n-k columns of the parity check matrix are linearly independent

    • Every k columns of the generator matrix are linearly independent

    • C is Maximum Distance Separable (MDS) (definition: if d=n-k+1)

    52

  • Reed-Muller Codes

    • Parameters: n = 2^m, d = 2^(m−r), k = C(m,0) + C(m,1) + … + C(m,r)

    • Example: m = 3, r = 1

      n = 8, d = 4, k = 1 + 3 = 4

      (8,4,4) code with generator matrix

      G = [ 1 1 1 1 1 1 1 1 ]
          [ 0 0 0 0 1 1 1 1 ]
          [ 0 0 1 1 0 0 1 1 ]
          [ 0 1 0 1 0 1 0 1 ]

    53
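    A small sketch of the RM(r, m) parameter formulas from this slide; it reproduces the (8,4,4) example and the (32,6,16) Mariner 9 code mentioned near the end of the deck.

```python
from math import comb

def rm_params(r, m):
    """Parameters (n, k, d) of the Reed-Muller code RM(r, m)."""
    n = 2 ** m
    k = sum(comb(m, i) for i in range(r + 1))
    d = 2 ** (m - r)
    return n, k, d

print(rm_params(1, 3))   # (8, 4, 4)
print(rm_params(1, 5))   # (32, 6, 16) -- the code used by Mariner 9
```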

  • Reed-Muller Codes

    • r-th order code, length 2^m, 0 ≤ r ≤ m: RM(r, m)

    • RM(0, m) = {00…0, 11…1} → repetition code

    • RM(m, m) = V(2^m, 2) → the whole vector space

    • RM(r, m) = { (x, x+y) | x ∈ RM(r, m−1), y ∈ RM(r−1, m−1) },  0 < r < m

    • Examples

      RM(0,0) = {0, 1}

      RM(0,1) = {00, 11}          RM(1,1) = {00, 01, 10, 11} = K^2

      RM(0,2) = {0000, 1111}      RM(2,2) = K^4

      RM(1,2) = { (x, x+y) | x ∈ {00, 01, 10, 11}, y ∈ {00, 11} }

    54

  • Reed-Muller Codes

    • RM(1,2) → (4,3,2) code

    • Codewords: 0000 0011

    0101 0110

    1010 1001

    1111 1100

      G = [ 1 1 1 1 ]
          [ 0 0 1 1 ]
          [ 0 1 0 1 ]

    55
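    A sketch of the recursive (x, x+y) construction in terms of generator matrices: G(r, m) is built from G(r, m−1) and G(r−1, m−1). It reproduces generators for RM(1,2) and RM(1,3) above (the row spaces match the matrices on these slides, although the row order may differ).

```python
import numpy as np

def rm_generator(r, m):
    """Generator matrix of RM(r, m) via the recursive (x, x+y) construction."""
    if r == 0:                                   # repetition code
        return np.ones((1, 2 ** m), dtype=int)
    if r == m:                                   # RM(m, m) = whole space: add the weight-1 word
        return np.vstack([rm_generator(m - 1, m),
                          np.eye(2 ** m, dtype=int)[-1:]])
    top = rm_generator(r, m - 1)                 # G(r, m-1)
    bot = rm_generator(r - 1, m - 1)             # G(r-1, m-1)
    return np.vstack([np.hstack([top, top]),
                      np.hstack([np.zeros_like(bot), bot])])

print(rm_generator(1, 2))        # rows 1111 / 0101 / 0011: the (4,3,2) code above
print(rm_generator(1, 3).shape)  # (4, 8): generator of the (8,4,4) code RM(1,3)
```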

  • Example: G(1,3), from the block construction

    G(r, m) = [ G(r, m−1)   G(r, m−1)   ]
              [     0       G(r−1, m−1) ]

  • Properties of Reed-Muller Codes

    • RM(r−1, m) is contained in RM(r, m), r > 0

    • Example: m = 3, r = 1

      RM(1,3) → (8,4,4) extended Hamming code contains RM(0,3) → (8,1,8) repetition code

    • The dual code of RM(r, m) is RM(m−1−r, m), r < m

    • Example: m = 3, r = 0

      RM(0,3) → (8,1,8) repetition code

      RM(2,3) → (8,7,2) SPC code

    57

  • 1971: Mariner 9

    Mariner 9 used a (32,6,16) Reed-Muller code to transmit its grey images of Mars.

    camera rate: 100,000 bits/second

    transmission speed: 16,000 bits/second

    58

  • 1979+: Voyager I and II

    Voyager I and II used a [24,12,8] extended Golay code to send their color images of Jupiter and Saturn.

    Voyager 2 traveled further to Uranus and Neptune. Because of the higher error rate it switched to a more robust Reed-Solomon code.

    59

  • Modern Codes

    More recently Turbo codes were invented, and are used in 3G cell phones, (future) satellites, and in the Cassini-Huygens space probe [1997–].

    Other modern codes: Expander, Fountain, Raptor, and LDPC codes are being proposed for future communications standards

    60

  • NASA Error Correcting Codes

    [Figure: imperfectness (dB) versus signal-to-noise ratio for NASA error correcting codes]

    61