Understanding Data Compression and Error Detection: Lecture Summary and Homework

Today's Lecture (Feb 15, 2000) • Reading and Homework Assignments • Summary of Last Lecture • Start Computers • Data Compression • Huffman Code • More on Encryption

H/W and Reading • Homework Assignment #1 • Problems 7.1, 7.2, 7.3, 7.7 • Due Thursday Feb 17th • Text Sections to Date • Chapter 1 • Section 2.5 (part about UPC scanner) • Sections 7.1, 7.2, 7.3, 7.7, 7.12 • Text Sections for Today’s Lecture • Sections 7.9, 7.10

Current Events

Summary of Last Lecture • Redundancy can be introduced in a code to detect the bit errors that inevitably result from imperfect communication or sensor scanning. This is called error detection and correction coding. • Bit errors can be detected and/or corrected by: • Bit repetition • Parity • All schemes presented work ONLY if a single bit is in error
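A minimal sketch of bit repetition with majority-vote decoding. The slide only names the technique, so the repetition factor of 3 and the function names `repeat_encode` and `repeat_decode` are assumptions made for illustration.

```python
def repeat_encode(bits, n=3):
    """Send every bit n times, e.g. '10' -> '111000' for n=3 (n is an assumed value)."""
    return "".join(b * n for b in bits)


def repeat_decode(received, n=3):
    """Take a majority vote over each group of n received bits."""
    groups = [received[i:i + n] for i in range(0, len(received), n)]
    return "".join("1" if g.count("1") > n // 2 else "0" for g in groups)


sent = repeat_encode("101")            # '111000111'
corrupted = "110000111"                # one bit flipped in the first group
recovered = repeat_decode(corrupted)   # majority vote restores '101'
print(sent, corrupted, recovered)
```

With n = 3 a single flipped bit per group is outvoted. Two flips in the same group would win the vote and go unnoticed, which is the single-bit limitation noted on the slide.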

Even and Odd Parity • There are two types of parity • Even • Parity bit is 1 if an odd number of ones are present in the original codeword, so the total number of ones becomes even • Odd • Parity bit is 0 if an odd number of ones are present in the original codeword, so the total number of ones stays odd • Note that in Example 7.4 of the text, “odd” parity is used on the columns
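The two parity rules above can be sketched as follows. The function names `parity_bit` and `has_error` and the sample 7-bit codeword are hypothetical, chosen only to illustrate the rules.

```python
def parity_bit(codeword, scheme="even"):
    """Return the parity bit for a string of '0'/'1' characters."""
    odd_ones = codeword.count("1") % 2 == 1
    if scheme == "even":
        return "1" if odd_ones else "0"   # make the total number of ones even
    return "0" if odd_ones else "1"       # make the total number of ones odd


def has_error(received, scheme="even"):
    """Check a received word (codeword plus parity bit) against the scheme."""
    expected_remainder = 0 if scheme == "even" else 1
    return received.count("1") % 2 != expected_remainder


word = "1011000"                          # hypothetical codeword with three ones
sent = word + parity_bit(word, "even")    # '10110001'
print(has_error(sent))                    # no bits flipped
print(has_error("00110001"))              # one bit flipped: detected
print(has_error("01110001"))              # two bits flipped: not detected
```

The last call shows why a single parity bit only catches an odd number of bit errors: two flips leave the count of ones with the same parity.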

Two Types of Codes • Variable Length • Codewords are a variable number of bits • Fixed length • Codewords are a fixed number of bits • The UPC and ASCII codes are examples of fixed length codes

Huffman Coding • Goal: Develop a code that reduces the total number of bits required to encode information. This is called compression. • Idea: Assign short codewords to the most frequently occurring symbols and long codewords to the less frequently occurring symbols
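The slides do not spell out how the code is constructed. The sketch below uses the standard Huffman procedure of repeatedly merging the two least frequent entries, and the symbol counts in the sample string are assumptions chosen for illustration. The name `huffman_code` is hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(text):
    """Build a {symbol: codeword} table by merging the two rarest entries."""
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(n, next(tiebreak), {sym: ""}) for sym, n in Counter(text).items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        n0, _, t0 = heapq.heappop(heap)   # least frequent subtree
        n1, _, t1 = heapq.heappop(heap)   # second least frequent subtree
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (n0 + n1, next(tiebreak), merged))
    return heap[0][2]


# Assumed counts: A occurs 8 times, B 4 times, C 2 times, D once.
code = huffman_code("AAAAAAAABBBBCCD")
for symbol in sorted(code):
    print(symbol, code[symbol])
```

With these assumed counts the most frequent symbol, A, receives a 1-bit codeword and the rarest symbols, C and D, receive 3-bit codewords, giving the same four codewords used in the lecture's tree example.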

Tree Structure • Root (Starting Point) • Branches • Nodes • Leaves

Huffman Code Tree Example • [Figure: binary tree with each pair of branches labeled 1 and 0; leaves are A (1), B (01), C (001), D (000)] • Symbols are leaves of the tree • Codeword is identified by starting at root and branching to leaf

Huffman Encoding Example • Consider the Symbol Sequence AABAD • Problem: Encode the Symbol Sequence using the Huffman Code just presented

Solution • Huffman Code: A = 1, B = 01, C = 001, D = 000 • Encode AABAD: 1 1 01 1 000 = 11011000 • Thus, AABAD is represented by 8 bits
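The encoding step can be checked with a short sketch. The codeword table is taken from the slide; the names `HUFFMAN` and `encode` are hypothetical.

```python
# Codeword table from the lecture's Huffman code tree.
HUFFMAN = {"A": "1", "B": "01", "C": "001", "D": "000"}


def encode(symbols, table):
    """Concatenate the codeword of each symbol in order."""
    return "".join(table[s] for s in symbols)


bits = encode("AABAD", HUFFMAN)
print(bits, len(bits))  # 11011000 8
```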

Compare to Fixed Length Code • Fixed Length Code: A = 00, B = 01, C = 10, D = 11 • Encode AABAD: 00 00 01 00 11 = 0000010011 • Thus, AABAD is represented by 10 bits
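A small sketch comparing the two codes on the same sequence. Both tables come from the slides; the helper name `encoded_length` is an assumption.

```python
FIXED = {"A": "00", "B": "01", "C": "10", "D": "11"}
HUFFMAN = {"A": "1", "B": "01", "C": "001", "D": "000"}


def encoded_length(symbols, table):
    """Total number of bits needed to encode the sequence with the table."""
    return sum(len(table[s]) for s in symbols)


sequence = "AABAD"
fixed_bits = encoded_length(sequence, FIXED)       # 10
huffman_bits = encoded_length(sequence, HUFFMAN)   # 8
print(fixed_bits, huffman_bits, fixed_bits - huffman_bits)
```

The saving depends on the symbol frequencies. A sequence dominated by C and D, such as CDCD, would take 12 bits with this Huffman code but only 8 bits with the fixed length code.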

Huffman Decoding Example
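A minimal decoding sketch, assuming the codeword table from the tree example. Because no codeword is a prefix of another, the decoder can read bits left to right and emit a symbol as soon as the bits read so far match a codeword, which corresponds to walking from the root to a leaf. The name `decode` is hypothetical.

```python
HUFFMAN = {"A": "1", "B": "01", "C": "001", "D": "000"}


def decode(bits, table):
    """Read bits left to right, emitting a symbol at each complete codeword."""
    inverse = {codeword: symbol for symbol, codeword in table.items()}
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:          # reached a leaf of the tree
            symbols.append(inverse[current])
            current = ""                # restart at the root
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(symbols)


print(decode("11011000", HUFFMAN))  # AABAD
```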

Real World Histograms • Figure 7.19 in Text

Do Problem 7.19

Do Problem 7.20

Data Compression • Lossless Compression • Techniques retain all the information present in the original data • Examples: Huffman encoding, run-length encoding • Lossy Compression • Techniques allow some loss of the original data • Example: MPEG video compression
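Run-length encoding is named above as a lossless technique but not shown. A minimal sketch follows, assuming the data is a string and each run is stored as a (symbol, count) pair; the function names and the sample string are hypothetical.

```python
from itertools import groupby


def rle_encode(data):
    """Replace each run of identical symbols with a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def rle_decode(pairs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in pairs)


original = "AAAABBBCCD"
pairs = rle_encode(original)
print(pairs)
print(rle_decode(pairs) == original)  # lossless: the round trip is exact
```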
