Information Theory

Information Theory Linear Block Codes Jalal Al Roumy

Hamming distance
The intuitive concept of "closeness" of two words is formalized through the Hamming distance d(x,y) of words x, y. For two words (or vectors) x, y of equal length, d(x,y) = the number of positions in which x and y differ.
Example: d(10101,01100) = 3, because the words differ in the first, second and fifth positions.
Properties of Hamming distance
(1) d(x,y) = 0 iff x = y
(2) d(x,y) = d(y,x)
(3) d(x,z) ≤ d(x,y) + d(y,z) (triangle inequality)
An important parameter of a code C is its minimum distance, d(C) = min{d(x,y) | x, y ∈ C, x ≠ y}, because it gives the smallest number of errors needed to change one codeword into another.
Theorem (basic error-correcting theorem)
(1) A code C can detect up to s errors if d(C) ≥ s+1.
(2) A code C can correct up to t errors if d(C) ≥ 2t+1.
Note: for binary linear codes, d(C) = the smallest weight W(C) of a non-zero codeword.
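The definitions above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names (hamming_distance, minimum_distance, capability) are illustrative assumptions, and words are represented as strings of symbols.

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of positions in which the equal-length words x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """d(C): the smallest distance between two distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

def capability(code):
    """Return (s, t): the numbers of errors detectable and correctable
    according to the theorem (d >= s + 1 and d >= 2t + 1)."""
    d = minimum_distance(code)
    return d - 1, (d - 1) // 2

print(hamming_distance("10101", "01100"))   # 3

code = ["00000", "01101", "10110", "11011"]
print(minimum_distance(code))               # 3
print(capability(code))                     # (2, 1)
```

With d(C) = 3 the example code detects up to 2 errors and corrects 1, in line with the theorem.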

Some notation
Notation: An (n,M,d)-code C is a code such that
• n is the length of the codewords,
• M is the number of codewords,
• d is the minimum distance in C.
Example:
C1 = {00,01,10,11} is a (2,4,1)-code.
C2 = {000,011,101,110} is a (3,4,2)-code.
C3 = {00000,01101,10110,11011} is a (5,4,3)-code.
Comment: A good (n,M,d)-code has small n and large M and d.

Code Rate
For a q-ary (n,M,d)-code we define the code rate, or information rate, R, by
R = (log_q M) / n.
The code rate represents the ratio of the number of input data symbols to the number of transmitted code symbols. It is an important parameter for real implementations (of a Hadamard code, for example), because it shows what fraction of the bandwidth is being used to transmit actual data.
Recall that log2(n) = ln(n)/ln(2).
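As a small numerical illustration, the sketch below assumes the usual definition R = (log_q M)/n for a q-ary (n,M,d)-code; the function name code_rate is an illustrative choice, and the sample parameters are those of the (2,4,1), (3,4,2) and (5,4,3) binary codes.

```python
import math

def code_rate(n, M, q=2):
    """Information rate R = log_q(M) / n of a q-ary code with M codewords of length n."""
    return math.log(M, q) / n

# Binary example codes given as (n, M) pairs
for n, M in [(2, 4), (3, 4), (5, 4)]:
    print(n, M, round(code_rate(n, M), 3))
```

The rate drops from 1 to 2/3 to 2/5 as more redundancy is added to carry the same 2 bits of information.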

Equivalence of codes
Definition: Two q-ary codes are called equivalent if one can be obtained from the other by a combination of operations of the following types:
(a) a permutation of the positions of the code;
(b) a permutation of the symbols appearing in a fixed position.
Let a code be displayed as an M × n matrix. What do operations (a) and (b) correspond to?
Distances between codewords are unchanged by operations (a) and (b). Consequently, equivalent codes have the same parameters (n,M,d) (and correct the same number of errors).
Examples of equivalent codes
Lemma: Any q-ary (n,M,d)-code over an alphabet {0,1,…,q-1} is equivalent to an (n,M,d)-code which contains the all-zero codeword 00…0.
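One way to see the lemma is to pick any codeword and, in each position i, apply the symbol permutation a → (a − p_i) mod q, where p_i is the chosen codeword's symbol in that position. This is a combination of type (b) operations. The Python sketch below illustrates the idea; the function names and the ternary sample code are assumptions made for illustration only.

```python
from itertools import combinations

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

def shift_to_zero(code, q):
    """Return an equivalent code containing 00...0.

    In position i every symbol a is replaced by (a - p_i) mod q, where
    p = code[0]; each such map is a permutation of the alphabet."""
    pivot = code[0]
    return [tuple((a - p) % q for a, p in zip(word, pivot)) for word in code]

code = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]      # a ternary (3,3,3)-code
shifted = shift_to_zero(code, q=3)
print(shifted)                                 # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
print(minimum_distance(code), minimum_distance(shifted))   # 3 3
```

The number of codewords, their length and the minimum distance are all unchanged, as expected for equivalent codes.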

The main coding theory problem
A good (n,M,d)-code has small n, large M and large d. The main coding theory problem is to optimize one of the parameters n, M, d for given values of the other two.
Notation: A_q(n,d) is the largest M such that there is a q-ary (n,M,d)-code.
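For very small parameters the value of A_q(n,d) can be found by exhaustive search. The sketch below is a naive backtracking search written for illustration only (the name largest_code_size is an assumption); it looks for the largest set of words with pairwise distance at least d, and its running time grows far too quickly to be useful beyond toy cases.

```python
from itertools import product

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def largest_code_size(n, d, q=2):
    """Largest number of q-ary words of length n with pairwise distance >= d."""
    words = list(product(range(q), repeat=n))
    best = 0

    def extend(chosen, candidates):
        nonlocal best
        best = max(best, len(chosen))
        # Prune: even taking every remaining candidate cannot beat the best so far.
        if len(chosen) + len(candidates) <= best:
            return
        for i, w in enumerate(candidates):
            # Keep only later candidates that are far enough from w.
            rest = [v for v in candidates[i + 1:] if hamming_distance(v, w) >= d]
            extend(chosen + [w], rest)

    extend([], words)
    return best

print(largest_code_size(3, 2))   # 4
print(largest_code_size(5, 3))   # 4
```

Both results agree with the example codes: no binary code of length 3 and distance 2, or of length 5 and distance 3, has more than 4 codewords.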

Introduction to linear codes

Linear Block Codes
• Information is divided into blocks of length k.
• r parity bits or check bits are added to each block (total length n = k + r).
• Code rate R = k/n.
• Decoder looks for the codeword closest to the received vector (code vector + error vector).
• Tradeoffs between
  • Efficiency
  • Reliability
  • Encoding/Decoding complexity

[Figure: Operations of the generator matrix and the parity check matrix. Message vector m → generator matrix G → code vector c; code vector c → parity check matrix H^T → null vector 0.]
Linear Block Codes
The parity check matrix H is used to detect errors in the received word by using the fact that c H^T = 0 (the null vector).
Let x = c ⊕ e be the received word, where c is the correct codeword and e is the error vector.
Compute S = x H^T = (c ⊕ e) H^T = c H^T ⊕ e H^T = e H^T.
If S = 0, the received word is accepted as correct; otherwise it contains errors, and the correct message can be decoded from the commonly known error patterns.

Linear Block Codes
• The code vector c of a linear block code is c = m G, where m is the message (information) vector of length k and G is the generator matrix. In systematic form, G = [I_k | P] is a k × n matrix, where I_k is the k × k unit matrix.
• The parity check matrix is H = [P^T | I_{n-k}], where P^T is the transpose of the matrix P.
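The relation between G and H can be illustrated with a small binary example. In the sketch below the matrix P is an assumption chosen so that G generates the code {00000, 01101, 10110, 11011}; the helper names are illustrative, and all arithmetic is modulo 2.

```python
def identity(k):
    return [[int(i == j) for j in range(k)] for i in range(k)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def hstack(A, B):
    """Place matrix B to the right of matrix A."""
    return [ra + rb for ra, rb in zip(A, B)]

def matmul_mod2(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

P = [[1, 1, 0],
     [1, 0, 1]]                        # k x (n-k) parity part (assumed example)
k, r = len(P), len(P[0])
G = hstack(identity(k), P)             # G = [I_k | P]
H = hstack(transpose(P), identity(r))  # H = [P^T | I_{n-k}]

print(matmul_mod2(G, transpose(H)))    # [[0, 0, 0], [0, 0, 0]]

# The syndrome of a received word x = c + e depends only on the error e.
c = [1, 0, 1, 1, 0]
e = [0, 0, 1, 0, 0]
x = [(a + b) % 2 for a, b in zip(c, e)]
print(matmul_mod2([x], transpose(H)))  # [[1, 0, 0]]
print(matmul_mod2([e], transpose(H)))  # [[1, 0, 0]]
```

Every row of G is orthogonal to every row of H, so every codeword m G has zero syndrome.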

Forming the generator matrix
The generator matrix is formed from the list of codewords by ignoring the all-zero vector and any codeword that is a linear combination of codewords already selected.
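A minimal Python sketch of this selection for binary codes follows. It is one possible way to carry out the procedure, not necessarily the one used in the lecture: codewords are given as bit strings, and a word is kept only if it is not a sum (mod 2) of words already kept. The function name generator_rows is an assumption.

```python
def generator_rows(codewords):
    """Select codewords forming a basis of a binary linear code."""
    rows = []      # selected codewords, in the order found
    reduced = []   # the same span in echelon form, stored as integers
    for word in codewords:
        v = int(word, 2)
        # Reduce v against the basis; entries are sorted by leading bit.
        for b in reduced:
            v = min(v, v ^ b)
        if v:      # non-zero remainder: word is independent of the rows so far
            rows.append(word)
            reduced.append(v)
            reduced.sort(reverse=True)
    return rows

print(generator_rows(["00000", "01101", "10110", "11011"]))
# ['01101', '10110']
```

The all-zero word is skipped, and 11011 is dropped because it equals 01101 + 10110.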

Equivalent linear [n,k]-codes
Two k × n matrices generate equivalent linear codes over GF(q) if one matrix can be obtained from the other by a sequence of operations of the following types:
(R1) Permutation of rows
(R2) Multiplication of a row by a non-zero scalar
(R3) Addition of a scalar multiple of one row to another
(C1) Permutation of columns
(C2) Multiplication of any column by a non-zero scalar
The row operations (R) preserve the linear independence of the rows of the generator matrix and simply replace one basis by another of the same code. The column operations (C) convert the generator matrix to one for an equivalent code.

Transforming the generator matrix
Transforming to the form G = [I_k | P]
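The transformation can be sketched as Gaussian elimination over GF(2). The code below is an illustrative sketch for binary codes only; the function name standard_form is an assumption. It uses row swaps (R1) and row additions (R3), and falls back to a column swap (C1) when no pivot is available in the current column, in which case the result generates an equivalent code rather than the same code.

```python
def standard_form(G):
    """Reduce a binary k x n generator matrix to the form [I_k | P]."""
    G = [row[:] for row in G]          # work on a copy
    k, n = len(G), len(G[0])
    for i in range(k):
        # First column >= i holding a 1 in some row >= i.
        pivot = next(((r, c) for c in range(i, n) for r in range(i, k) if G[r][c]),
                     None)
        if pivot is None:
            raise ValueError("rows are linearly dependent")
        r, c = pivot
        G[i], G[r] = G[r], G[i]        # (R1) bring the pivot row into place
        if c != i:                     # (C1) bring the pivot column into place
            for row in G:
                row[i], row[c] = row[c], row[i]
        for j in range(k):             # (R3) clear the rest of column i
            if j != i and G[j][i]:
                G[j] = [a ^ b for a, b in zip(G[j], G[i])]
    return G

print(standard_form([[1, 1, 0, 1, 1],
                     [1, 0, 1, 1, 0]]))
# [[1, 0, 1, 1, 0], [0, 1, 1, 0, 1]]
```

In this example only row operations are needed, so the output generates the same code as the input.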

Encoding with the generator
Codeword = u G, where u is the message vector and G is the generator matrix.
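As an illustration, the following sketch encodes every 2-bit message with an assumed binary generator matrix G = [I_2 | P]; the matrix and the function name encode are illustrative choices, not taken from the slides.

```python
from itertools import product

def encode(u, G, q=2):
    """Codeword u G over GF(q) for prime q (integer arithmetic mod q)."""
    return tuple(sum(ui * g for ui, g in zip(u, col)) % q for col in zip(*G))

G = [[1, 0, 1, 1, 0],
     [0, 1, 1, 0, 1]]                  # assumed example generator matrix

for u in product([0, 1], repeat=len(G)):
    print(u, encode(u, G))
# (0, 0) (0, 0, 0, 0, 0)
# (0, 1) (0, 1, 1, 0, 1)
# (1, 0) (1, 0, 1, 1, 0)
# (1, 1) (1, 1, 0, 1, 1)
```

Because G is in standard form, the first k symbols of each codeword are the message itself and the remaining symbols are the check symbols.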

Parity-check matrix
A parity-check matrix H for an [n, k]-code C is an (n - k) × n matrix such that x H^T = 0 iff x ∈ C. A parity-check matrix for C is a generator matrix for the dual code C^⊥.
If G = [I_k | A] is the standard form generator matrix for an [n, k]-code C, then a parity-check matrix for C is H = [-A^T | I_{n-k}].
A parity-check matrix of the form [B | I_{n-k}] is said to be in standard form.

Decoding using Slepian matrix
An elegant nearest-neighbour decoding scheme was devised by Slepian in 1960. For an [n, k]-code C over GF(q):
• every vector in V(n, q) is in some coset of C
• every coset contains exactly q^k vectors
• two cosets are either disjoint or coincide
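The coset facts above can be checked on a small example by listing the cosets row by row, each row starting with a vector of minimum weight in its coset (a coset leader). The sketch below is an illustration for codes over GF(q) with prime q; the names standard_array and decode are assumptions, and the code list is assumed to start with the all-zero word.

```python
from itertools import product

def weight(v):
    return sum(s != 0 for s in v)

def standard_array(code, n, q=2):
    """Rows are the cosets a + C; row[0] is a minimum-weight coset leader."""
    vectors = sorted(product(range(q), repeat=n), key=weight)
    seen, rows = set(), []
    for a in vectors:
        if a not in seen:              # a starts a new coset
            coset = [tuple((x + c) % q for x, c in zip(a, word)) for word in code]
            rows.append(coset)
            seen.update(coset)
    return rows

def decode(y, rows, q=2):
    """Nearest-neighbour decoding: subtract the leader of the coset containing y."""
    for row in rows:
        if y in row:
            return tuple((a - b) % q for a, b in zip(y, row[0]))

code = [(0, 0, 0, 0, 0), (0, 1, 1, 0, 1), (1, 0, 1, 1, 0), (1, 1, 0, 1, 1)]
rows = standard_array(code, n=5)
print(len(rows), {len(row) for row in rows})   # 8 {4}
print(decode((1, 0, 0, 1, 0), rows))           # (1, 0, 1, 1, 0)
```

The 32 binary vectors of length 5 split into 8 disjoint cosets of 4 vectors each, and the received word 10010 is decoded to the codeword 10110, which differs from it in one position.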

Syndrome decoding
Suppose C is a q-ary [n, k]-code with the parity-check matrix H. For any vector y ∈ V(n, q), the row vector S(y) = y H^T is called the syndrome of y.
Two vectors have the same syndrome iff they lie in the same coset.
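Since the syndrome identifies the coset, a lookup table from syndromes to coset leaders is enough for decoding. The following is a minimal binary sketch under stated assumptions: the parity-check matrix H is an assumed example (for the code {00000, 01101, 10110, 11011}), the function names are illustrative, and the table is built by brute force over all 2^n vectors, which is only practical for short codes.

```python
from itertools import product

def syndrome(y, H):
    """S(y) = y H^T over GF(2)."""
    return tuple(sum(a * b for a, b in zip(y, row)) % 2 for row in H)

def syndrome_table(H):
    """Map each syndrome to a minimum-weight vector (coset leader) producing it."""
    n = len(H[0])
    table = {}
    for e in sorted(product([0, 1], repeat=n), key=sum):
        table.setdefault(syndrome(e, H), e)   # the first (lightest) vector wins
    return table

def decode(y, H, table):
    """Subtract (mod 2: add) the coset leader that has the same syndrome as y."""
    leader = table[syndrome(y, H)]
    return tuple(a ^ b for a, b in zip(y, leader))

H = [[1, 1, 1, 0, 0],
     [1, 0, 0, 1, 0],
     [0, 1, 0, 0, 1]]                  # assumed example parity-check matrix
table = syndrome_table(H)

y = (1, 1, 0, 1, 0)                    # codeword 11011 with an error in the last position
print(syndrome(y, H))                  # (0, 0, 1)
print(decode(y, H, table))             # (1, 1, 0, 1, 1)
```

The table has one entry per coset, here 2^(n-k) = 8, so it is much smaller than a full array of all 32 vectors.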

Decoding procedure The rules:

Example
