
Information and Coding Theory Introduction



  1. Information and Coding Theory Introduction

  2. Lecture times
  Tuesdays 12:30: 04.02, 11.02, 18.02, 25.02, 04.03, 11.03, 18.03, 25.03
  Thursdays 10:30: 06.02, 13.02, 20.02, 27.02, 06.03, 13.03, 20.03, 27.03 (???)
  Some changes are still possible, but none are currently planned. Exam session: 31.03.2014 – 11.04.2014.

  3. Information theory One of the few fields with an identifiable beginning: “A Mathematical Theory of Communication”, Bell System Technical Journal, C. Shannon, 1948; “The Mathematical Theory of Communication”, C. Shannon and W. Weaver, 1949. Claude Elwood Shannon. IT courses became very popular at universities, until the subject became too broad. Whether “information theory” is the right term is disputable (communication theory?). First applications: space communications, military. End of the road? ~1971 - lack of suitable hardware; ~2001 - in some cases the theoretical limits have already been reached.

  4. Information theory http://www.vf.utwente.nl/~neisser/public/reject.pdf

  5. Information theory

  6. Error correcting codes There is no single “discoverer”, but the first effective codes are due to R. Hamming (around 1950). Richard Hamming
  Some other popular codes:
  • Golay codes (Voyager spacecraft, around 1980)
  • Reed-Solomon codes (CDs, DVDs, DSL, RAID-6, etc.)
  • BCH (Bose & Chaudhuri & Hocquenghem) codes
  The course will be more oriented towards ECC than IT (so expect more algebra and not that much probability theory :)

  7. Applications of IT and/or ECC

  8. Applications of IT and/or ECC Voyager 1: launched 05.09.1977, now 127 AU from Earth. Voyager 2: launched 20.08.1977, now 103 AU from Earth. Error correction: (24,12,8) Golay code; Viterbi-decoded convolutional code, rate 1/2, constraint length k=7; later, concatenation with (255,223) Reed-Solomon codes over GF(256) was added.

  9. Applications of IT and/or ECC CD (1982): (32,28) + (28,24) RS codes. CD-ROM (1989): the same as above + (26,24) + (45,43) RS codes. DVD (1995): (208,192) + (182,172) RS codes. Blu-ray Disc (2006): (248,216) + (62,30) RS codes (LDC + BIS), “picket” encoding.

  10. Applications of IT and/or ECC Error correction can be drive specific. Initially it was mostly based on Reed-Solomon codes. Since around 2009 there has been increased use of LDPC (low-density parity-check) codes, with performance close to Shannon’s limit.

  11. Applications of IT and/or ECC One of the first modems that employed error correction and reached a 9600 bps transfer rate. Introduced in 1971. Priced at around “only” $11,000.

  12. Origins of information theory "The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point" Shannon, C.E. (1948), "A Mathematical Theory of Communication", Bell System Technical Journal, 27, pp. 379–423 & 623–656, July & October, 1948.

  13. Information transmission

  14. Noiseless channel

  15. Noiseless channel Are there any non-trivial problems concerning noiseless channels? E.g. how many bits do we need to transfer a particular piece of information? [All possible n-bit messages, each with probability 1/2^n → noiseless channel → receiver] Obviously n bits will be sufficient. Also, it is not hard to guess that n bits will be necessary to distinguish between all possible messages.

  16. Noiseless channel [All possible n-bit messages: message 000000... with probability ½, message 111111... with probability ½, all others with probability 0 → noiseless channel → receiver] n bits will still be sufficient. However, we can do quite nicely with just 1 bit!

  17. Noiseless channel [All possible n-bit messages, the probability of message i being p_i → noiseless channel → receiver] n bits will still be sufficient. If all p_i > 0, we will also need n or more bits for some messages, since we need to distinguish all of them. But what is the smallest average number of bits per message that we can do with? Entropy: derived from the Greek εντροπία, “a turning towards” (εν- “in” + τροπή “a turning”).
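The answer to this question is Shannon’s entropy of the message distribution. The formula itself appears on the following slides only as a figure, so here is a minimal sketch assuming the standard definition H = −Σ p_i·log2(p_i) (in bits); for the distribution of slide 16 it gives exactly 1 bit, matching the observation above.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum p_i * log2(p_i); zero-probability messages contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Slide 16: only two of the 2^n messages can occur, each with probability 1/2.
print(entropy([0.5, 0.5]))        # 1.0 -> on average 1 bit per message suffices
# Slide 15: all 2^n messages equally likely (here n = 4) -> n bits are needed.
print(entropy([1 / 16] * 16))     # 4.0
```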

  18. Binary entropy function Entropy of a Bernoulli trial as a function of success probability, often called the binary entropy function, Hb(p). The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss. [Adapted from www.wikipedia.org]
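The plotted curve itself is not reproduced in the transcript; the standard formula behind it (stated here for reference) is

```latex
H_b(p) = -p\,\log_2 p \;-\; (1-p)\,\log_2(1-p), \qquad H_b(0) = H_b(1) = 0, \quad H_b(\tfrac{1}{2}) = 1.
```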

  19. Encoding over noiseless channels The problem. Given a set M of n messages, message mi having probability pi, find a code (a mapping from M to {0,1}*) such that the average number of bits per transmitted message is as small as possible (i.e. a code that minimizes W = Σ pi·c(mi), where c(mi) is the number of bits used for encoding mi).
  What do we know about this?
  • it turns out that for any code we will have W ≥ E (the entropy of the message distribution)
  • there are codes that (to an extent) can approach E arbitrarily closely
  • some codes we will have a closer look at: Huffman codes, Shannon codes, arithmetic codes
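Slide 19 names Huffman codes among the codes to be studied later; as a small illustration (a hedged sketch, not the course’s own construction), the snippet below builds a Huffman code for a toy distribution and compares the average length W with the entropy E.

```python
import heapq
from math import log2

def huffman_code(probs):
    """Build a binary Huffman code for {message: probability}; returns {message: codeword}."""
    heap = [(p, i, {m: ""}) for i, (m, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)          # merge the two least probable subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {m: "0" + w for m, w in c1.items()}
        merged.update({m: "1" + w for m, w in c2.items()})
        counter += 1
        heapq.heappush(heap, (p1 + p2, counter, merged))
    return heap[0][2]

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
W = sum(p * len(code[m]) for m, p in probs.items())   # average number of bits per message
E = -sum(p * log2(p) for p in probs.values())         # entropy of the distribution
print(code)     # e.g. {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
print(W, E)     # 1.75 1.75 -- W = E here because all probabilities are powers of 1/2
```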

  20. Noisy channel In practice channels are always noisy (although sometimes the noise can be ignored). There are several types of noisy channels one can consider. We will restrict our attention to binary symmetric channels.

  21. Noisy channel Some other types of noisy channels. Binary erasure channel

  22. Noisy channel

  23. Noisy channel - the problem Assume a BSC with probability of transmission error p. In this case we assume that we have already decided on the optimal string of bits for transmission - i.e. each bit can have value 1 or 0 with equal probability ½. We want to maximize our chances of receiving the message without errors; to do this we are allowed to modify the message that we have to transmit. Usually we will assume that the message is composed of blocks of m bits each, and that we are allowed to replace a given m-bit block with an n-bit block of our choice (presumably we should have n ≥ m :) Such a replacement procedure will be called a block code. We would also like to maximize the ratio m/n (the code rate).
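As a quick aside (not part of the slides), a binary symmetric channel is easy to simulate: each transmitted bit is flipped independently with probability p.

```python
import random

def bsc(bits, p, rng=random):
    """Pass a bit string through a binary symmetric channel with crossover probability p."""
    return "".join(b if rng.random() >= p else str(1 - int(b)) for b in bits)

random.seed(0)
print(bsc("0110011" * 4, 0.1))   # most bits arrive intact; on average one in ten is flipped
```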

  24. Noisy channel - the problem If p > 0, can we guarantee that the message will be received without errors? With probability ≥ p^n any number of bits within each block could be corrupted... If we transmit just an unmodified block of m bits, the probability of error is 1 − (1−p)^m. Can we reduce this? Repetition code: replace each bit with 3 bits of the same value (0→000, 1→111). We will have n = 3m and probability of error 1 − ((1−p)^3 + 3p(1−p)^2)^m = 1 − (1 − 3p^2 + 2p^3)^m. Note that 1−p < 1 − 3p^2 + 2p^3 if 0 < p < ½.

  25. Repetition code R3 Probability of error when transmitting a single bit with no coding vs. with R3.

  26. Repetition codes RN For R3 the probability of unrecoverable error is 3p^2 − 2p^3. For RN (N odd, with majority-vote decoding) the probability of unrecoverable error is the sum over k > N/2 of C(N,k)·p^k·(1−p)^(N−k). Can we design something better than repetition codes?
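A minimal sketch of this binomial tail probability (the formula slide 26 presumably displays), checked against the R3 expression above:

```python
from math import comb

def rep_error_prob(N, p):
    """Probability that majority-vote decoding of the N-fold repetition code
    fails for one source bit: more than half of the N copies are flipped (N odd)."""
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range((N + 1) // 2, N + 1))

p = 0.1
print(p)                        # 0.1      -- error probability with no coding
print(rep_error_prob(3, p))     # 0.028    -- R3, equal to 3p^2 - 2p^3
print(3 * p**2 - 2 * p**3)      # 0.028
print(rep_error_prob(5, p))     # ~0.00856 -- R5 is better still, but the rate drops to 1/5
```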

  27. Hamming code [7,4] G - generator matrix. A (4-bit) message x is encoded as xG, i.e. if x = 0110 then c = xG = 0110011. Decoding? - there are 16 codewords, so if there are no errors we can just find the right one... - also we can note that the first 4 digits of c are the same as x :)
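The generator matrix itself is shown on the slide only as an image; the systematic matrix used in the sketch below is an assumption, chosen to be consistent with the slide’s example x = 0110 → c = xG = 0110011 and with the parity-check vectors of slides 28 and 30.

```python
# One possible systematic generator matrix for the [7,4] Hamming code (an assumption,
# consistent with the worked example on the slide: 0110 -> 0110011).
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(x):
    """Encode a 4-bit message (string of 0s and 1s) as the 7-bit codeword xG over GF(2)."""
    return "".join(
        str(sum(int(x[i]) * G[i][j] for i in range(4)) % 2) for j in range(7)
    )

print(encode("0110"))   # 0110011 -- the first 4 digits repeat the message, as noted on the slide
```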

  28. Hamming code [7,4]
  What to do if there are errors?
  • we assume that the number of errors is as small as possible - i.e. we can find the codeword c (and the corresponding x) that is closest to the received vector y (using Hamming distance)
  • consider vectors a = 0001111, b = 0110011 and c = 1010101; if y is received, compute y·a, y·b and y·c (inner products), e.g. for y = 1010010 we obtain y·a = 1, y·b = 0 and y·c = 0
  • this represents a binary number (100, or 4, in the example above) and we conclude that the error is in the 4th digit, i.e. the transmitted codeword was 1011010 (so x = 1011)
  Easy, but why does this method work?

  29. Hamming code [7,4] No errors - all pi's correspond to the di's. Error in one of d1, d2, d3 - a pair of wrong pi's. Error in d4 - all pi's are wrong. Error in a pi - the pattern differs from that of an error in any di. Parity bits of H(7,4)
  So:
  • we can correct any single error
  • since this is unambiguous, we should also be able to detect any 2 errors

  30. Hamming code [7,4] a = 0001111, b = 0110011 and c = 1010101. H - parity check matrix. Why does it work? We can check that without errors yH = 000 and that with 1 error yH gives the index of the damaged bit... General case: there always exists a matrix for checking orthogonality, yH = 0. Finding the damaged bits, however, isn’t that simple.
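A minimal sketch of the decoding procedure described on slides 28 and 30: the three inner products with a, b and c form the syndrome, which (read as a binary number) is the index of the damaged bit.

```python
A, B, C = "0001111", "0110011", "1010101"   # the parity-check vectors from slide 28

def dot(u, v):
    """Inner product of two bit strings over GF(2)."""
    return sum(int(x) & int(y) for x, y in zip(u, v)) % 2

def decode(y):
    """Correct at most one error in the received 7-bit word y; return (codeword, message)."""
    syndrome = 4 * dot(y, A) + 2 * dot(y, B) + dot(y, C)   # 0 means "no error detected"
    bits = list(y)
    if syndrome:
        i = syndrome - 1                                   # positions on the slide are 1-based
        bits[i] = "1" if bits[i] == "0" else "0"
    c = "".join(bits)
    return c, c[:4]                                        # systematic code: first 4 bits = message

print(decode("1010010"))   # ('1011010', '1011') -- the worked example from slide 28
print(decode("0110011"))   # ('0110011', '0110') -- no errors
```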

  31. Block codes
  • the aim: for given k and n, correct as many errors as possible
  • if the minimal distance between codewords is d, we will be able to correct up to ⌊(d−1)/2⌋ errors (checked for the [7,4] code in the sketch below)
  • in principle we can choose any set of codewords, but it is easier to work with linear codes
  • decoding could still be a problem
  • an even more restricted and more convenient class is that of cyclic codes
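As a sanity check of the distance claim (an illustration using the [7,4] code from the previous slides, with the generator matrix assumed in the encoding sketch): enumerating all 16 codewords shows a minimum distance of 3, hence ⌊(3−1)/2⌋ = 1 correctable error per block.

```python
from itertools import product

# Rows of the assumed systematic generator matrix of the [7,4] Hamming code.
ROWS = ["1000011", "0100101", "0010110", "0001111"]

def xor(u, v):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))

def distance(u, v):
    return sum(a != b for a, b in zip(u, v))

# All 16 codewords are the GF(2) linear combinations of the rows.
codewords = []
for coeffs in product([0, 1], repeat=4):
    c = "0000000"
    for bit, row in zip(coeffs, ROWS):
        if bit:
            c = xor(c, row)
    codewords.append(c)

d = min(distance(u, v) for u in codewords for v in codewords if u != v)
print(d, (d - 1) // 2)   # 3 1 -> minimum distance 3, so one error per block can be corrected
```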

  32. Some more complex approaches
  • we have formulated the lossless communication problem in terms of correcting a maximal number of bits in each block of an [n,k] code, and we will study methods for constructing and analyzing such codes
  • errors quite often occur in bursts...
  • it is possible to “spread out” individual blocks (interleaving, see the sketch below)
  • it turns out that methods that just try to minimize transmission errors (without guarantees regarding the number of corrected bits) often work better
  • there are recently developed methods/resources that allow such codes to be used efficiently in practice, and they are close to “optimal”:
  -- low-density parity-check codes (LDPC)
  -- turbo codes
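A minimal sketch of the block-interleaving idea mentioned above (an illustration only, not a scheme taken from the course): data is written into an array row by row and transmitted column by column, so a burst of channel errors is spread over several codewords after de-interleaving.

```python
def interleave(symbols, rows, cols):
    """Block interleaver: write row by row into a rows x cols array, read out column by column."""
    assert len(symbols) == rows * cols
    return "".join(symbols[r * cols + c] for c in range(cols) for r in range(rows))

def deinterleave(symbols, rows, cols):
    """Undo interleave(symbols, rows, cols)."""
    return interleave(symbols, cols, rows)

data = "abcdefghijkl"              # stands for 3 codewords of 4 symbols each
tx = interleave(data, 3, 4)        # 'aeibfjcgkdhl' -- neighbours now belong to different codewords
print(tx)
print(deinterleave(tx, 3, 4))      # 'abcdefghijkl'
```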

  33. Limits of noisy channels Given an [n,k] code, we define the rate of the code as R = k/n. The aim is to get R as large as possible for a given error correction capacity. Assume a BSC with error rate p. Apparently there should be some limit on how large a value of R can be achieved. A bit more about entropy: conditional entropy, mutual information, the binary entropy function.
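The quantities named above appear on the slide only as formulas in an image; their standard definitions (stated here for reference, the binary entropy function having been given after slide 18) are

```latex
H(X \mid Y) = -\sum_{x,y} p(x,y)\,\log_2 p(x \mid y), \qquad
I(X;Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X).
```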

  34. Limits of noisy channels Given an [n,k] code, we define the rate of the code as R = k/n. The aim is to get R as large as possible for a given error correction capacity. Assume a BSC with error rate p. Apparently there should be some limit on how large a value of R can be achieved.

  35. Channel capacity For BSC there is just a “fixed distribution” defined by p.
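The capacity formula is likewise only an image on the slide; for the BSC the standard result is that the mutual information is maximized by the uniform input distribution, giving

```latex
C = \max_{p(x)} I(X;Y) = 1 - H_b(p) \quad \text{bits per channel use.}
```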

  36. Shannon Channel Coding Theorem Shannon’s original proof just shows that such codes exist. With LDPC and turbo codes it is actually possible to approach Shannon’s limit as close as we wish.
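The theorem itself appears only as an image; in its standard form (paraphrased here, not quoted from the slide): for every rate R < C there exist codes of rate R whose block error probability tends to 0 as the block length n grows, while for R > C the error probability stays bounded away from 0. Symbolically:

```latex
\forall R < C:\ \exists\ [n, \lceil Rn \rceil]\ \text{codes with}\ P_{\mathrm{err}} \xrightarrow[n \to \infty]{} 0;
\qquad \forall R > C:\ \inf_{\text{codes of rate } R} P_{\mathrm{err}} > 0.
```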

  37. Some recent codes Convolutional codes. Turbo codes - interleaving (trying to combine several reasonably good codes), feedback (decoding of the next row depends on errors in the previous ones). LDPC (Low Density Parity Check) codes.

  38. Shannon Channel Coding Theorem

  39. Plans for nearest future :) We will need to start with some facts and results from algebra; to make the most mathematical parts somewhat less intense I propose to mix different subjects a bit: 06.02 (Thu) 10:30 Block codes and linear block codes. Some examples. Groups, fields, vector spaces, codes - basic definitions. 11.02 (Tue) 12:30 Codes - syndrome decoding, some more definitions, again something from algebra :) 13.02 (Thu) 10:30 Entropy - basic definitions, its relation to data compression.

  40. Requirements
  • 5-6 homeworks. In principle these are short-term, with a deadline of up to 2 weeks. The requirement is a bit relaxed this year; still, at least half of the homeworks must be submitted before the exam session starts. 80% of the grade.
  • Exam. In written form; it will consist of practical exercises and, probably, some theoretical questions. A take-home exam - you get the questions and bring them back within 48 hours. 20% of the grade.
  • To qualify for grade 10 you may be asked to cope with some additional question(s)/problem(s).

  41. Academic honesty You are expected to submit only your own work! Sanctions:
  • receiving a zero on the assignment (under no circumstances will a resubmission be allowed)
  • no admission to the exam and no grade for the course

  42. Textbooks Vera Pless Introduction to the Theory of Error-Correcting Codes Wiley-Interscience, 1998 (3rd ed) Course textbook

  43. Textbooks W.Cary Huffman, Vera Pless Fundamentals of Error-Correcting Codes Cambridge University Press, 2003

  44. Textbooks Neil J. A. Sloane, Florence Jessie MacWilliams The Theory of Error-Correcting Codes North Holland, 1983

  45. Textbooks Juergen Bierbrauer Introduction to Coding Theory Chapman & Hall/CRC, 2004

  46. Textbooks David J. C. MacKay Information Theory, Inference and Learning Algorithms Cambridge University Press, 2007 (6th ed) http://www.inference.phy.cam.ac.uk/mackay/itila/

  47. Textbooks Somewhat “heavy” on probabilities-related material, but very “user friendly” and recommended for entropy-related topics.

  48. Textbooks Thomas M. Cover, Joy A. Thomas Elements of Information Theory Wiley-Interscience, 2006 (2nd ed)

  49. Textbooks Todd K. Moon Error Correcting Coding Mathematical Methods and Algorithms Wiley-Interscience, 2005

  50. Web page • http://susurs.mii.lu.lv/juris/courses/ict2014.html • It is expected to contain: • short summaries of lectures • announcements • power point presentations • homework problems • frequently asked questions (???) • your grades (???) • other useful information
