This lecture covers data compression and reviews error detection. It introduces Huffman coding, which shortens the representation of data by assigning variable-length codewords according to how often each symbol occurs, and it reviews error detection methods such as parity bits, which help protect data integrity during transmission. Homework problems and reading assignments from the text accompany these topics.
Today's Lecture (Feb 15, 2000) • Reading and Homework Assignments • Summary of Last Lecture • Start Computers • Data Compression • Huffman Code • More on Encryption
H/W and Reading • Homework Assignment #1 • Problems 7.1, 7.2, 7.3, 7.7 • Due Thursday Feb 17th • Text Sections to Date • Chapter 1 • Section 2.5 (part about UPC scanner) • Sections 7.1, 7.2, 7.3, 7.7, 7.12 • Text Sections for today’s lecture • Sections 7.9, 7.10
Summary of Last Lecture • Redundancy can be introduced in a code to detect inevitable bit errors which may result from imperfect communication or sensor scan. This is called error detection and correction coding. • Bit errors can be detected and/or corrected by: • Bit repetition • Parity • All schemes presented work ONLY if one bit is in error
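As a rough illustration of bit repetition, the sketch below repeats each bit three times and decodes by majority vote. The function names and the repetition factor of 3 are assumptions chosen for the example, not taken from the text.

```python
def repeat_encode(bits, n=3):
    """Repeat every bit n times (n should be odd so a majority always exists)."""
    return "".join(b * n for b in bits)


def repeat_decode(received, n=3):
    """Split the received bits into groups of n and take a majority vote in each."""
    groups = [received[i:i + n] for i in range(0, len(received), n)]
    return "".join("1" if g.count("1") > n // 2 else "0" for g in groups)


sent = repeat_encode("101")        # "111000111"
one_error = "110000111"            # one bit flipped in the first group
two_errors = "100000111"           # two bits flipped in the first group

print(repeat_decode(one_error))    # 101 (the single error is corrected)
print(repeat_decode(two_errors))   # 001 (two errors in one group defeat the vote)
```

The second call shows the limitation noted on the slide: with three repetitions, the scheme only recovers the original bit when at most one bit in a group is wrong.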
Even and Odd Parity • There are two types of parity • Even • Parity bit is 1 if an odd number of ones are present in the original codeword, so the total number of ones (including the parity bit) is even • Odd • Parity bit is 0 if an odd number of ones are present in the original codeword, so the total number of ones (including the parity bit) is odd • Note in example 7.4 of text, “odd” parity is used on the columns
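The two parity rules can be sketched in a few lines. The helper names `parity_bit` and `passes_check`, and the 7-bit sample codeword, are hypothetical and used only for illustration.

```python
def parity_bit(codeword, kind="even"):
    """Return the parity bit to append to a string of '0'/'1' characters."""
    odd_ones = codeword.count("1") % 2 == 1
    if kind == "even":
        return "1" if odd_ones else "0"   # make the total count of ones even
    return "0" if odd_ones else "1"       # make the total count of ones odd


def passes_check(received, kind="even"):
    """Check a received word (codeword plus parity bit) against the parity rule."""
    ones = received.count("1")
    return ones % 2 == (0 if kind == "even" else 1)


word = "1011000"                          # three ones
sent = word + parity_bit(word, "even")    # "10110001"

print(passes_check(sent))                 # True  (no errors)
print(passes_check("00110001"))           # False (one bit flipped: detected)
print(passes_check("01110001"))           # True  (two bits flipped: missed)
```

The last line shows why a single parity bit only detects an odd number of bit errors: two flips leave the parity unchanged.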
Two Types of Codes • Variable Length • Codewords are a variable number of bits • Fixed length • Codewords are a fixed number of bits • The UPC and ASCII codes are examples of fixed length codes
Huffman Coding • Goal: Develop a code to reduce the total number of bits required to encode information. This is called compression. • Idea: Assign short codewords to the most frequently occurring symbols and long codewords to less frequently occurring symbols
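One common way to build such a code is to repeatedly merge the two least frequent entries, which can be sketched with Python's `heapq`. The symbol frequencies below are invented for illustration; the slides do not give any. Which branch gets 0 and which gets 1 is also an arbitrary choice made here.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Build a prefix code from a {symbol: frequency} mapping.

    Note: with a single symbol this sketch returns an empty codeword.
    """
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, low = heapq.heappop(heap)    # least frequent subtree
        f1, _, high = heapq.heappop(heap)   # next least frequent subtree
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical frequencies: A is most common, D is rarest.
code = huffman_code({"A": 8, "B": 4, "C": 2, "D": 1})
for sym in sorted(code):
    print(sym, code[sym])
```

With these assumed frequencies the most frequent symbol gets a 1-bit codeword and the two rarest symbols get 3-bit codewords.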
Tree Structure • Root (Starting Point) • Branches • Node • Leaves
Huffman Code Tree Example • [Figure: binary tree with each branch labeled 1 or 0; leaves A (1), B (01), C (001), D (000)] • Symbols are leaves of the tree • Codeword is identified by starting at root and branching to leaf
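Decoding follows the same walk from root to leaf. The sketch below stores the example tree as nested dictionaries, a representation chosen here for illustration, and restarts at the root each time a leaf is reached.

```python
# Internal nodes are dicts keyed by branch label; leaves are symbol strings.
TREE = {"1": "A", "0": {"1": "B", "0": {"1": "C", "0": "D"}}}


def decode(bits, tree=TREE):
    """Walk the tree one bit at a time, emitting a symbol at every leaf."""
    symbols = []
    node = tree
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):   # reached a leaf
            symbols.append(node)
            node = tree             # start the next codeword at the root
    if node is not tree:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(symbols)


print(decode("01001000"))  # BCD
```

No separator is needed between codewords because no codeword is a prefix of another, so each leaf is reached unambiguously.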
Huffman Encoding Example • Consider the Symbol Sequence AABAD • Problem: Encode the Symbol Sequence using the Huffman Code just presented
Solution • Huffman Code: A = 1, B = 01, C = 001, D = 000 • Encode AABAD: 1 1 01 1 000 = 11011000 • Thus, AABAD is represented by 8 bits
Compare to Fixed Length Code • Code: A = 00, B = 01, C = 10, D = 11 • Encode AABAD: 00 00 01 00 11 = 0000010011 • Thus, AABAD is represented by 10 bits
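The encoding step is a table lookup followed by concatenation. A minimal sketch, with the function name `encode` chosen for this example:

```python
HUFFMAN = {"A": "1", "B": "01", "C": "001", "D": "000"}


def encode(symbols, code=HUFFMAN):
    """Concatenate the codeword of each symbol in order."""
    return "".join(code[s] for s in symbols)


bits = encode("AABAD")
print(bits, len(bits))  # 11011000 8
```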
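The saving depends on which symbols actually occur. The sketch below compares the two codes on the lecture's sequence and on a second, made-up sequence (DDCDC) dominated by the symbols with long codewords.

```python
HUFFMAN = {"A": "1", "B": "01", "C": "001", "D": "000"}
FIXED = {"A": "00", "B": "01", "C": "10", "D": "11"}


def encoded_length(symbols, code):
    """Total number of bits needed to encode the symbol sequence."""
    return sum(len(code[s]) for s in symbols)


for message in ("AABAD", "DDCDC"):
    print(message,
          encoded_length(message, HUFFMAN),
          encoded_length(message, FIXED))
# AABAD 8 10
# DDCDC 15 10
```

The variable-length code wins only when the short codewords go to the symbols that really are frequent, which is why the code is built from symbol frequencies.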
Real World Histograms • Figure 7.19 in Text
Data Compression • Lossless Compression • Techniques retain all the information present in the original data • Examples: Huffman encoding, run-length encoding • Lossy Compression • Techniques allow some loss of the original data • Example: MPEG video compression
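Run-length encoding, the other lossless example named above, can be sketched as replacing each run of identical symbols with a (symbol, count) pair. The pair representation and function names are assumptions made for this illustration; practical formats pack the runs differently.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, run_length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, run_length) pairs back into the original string."""
    return "".join(ch * n for ch, n in pairs)


pairs = rle_encode("AAAABBBCCD")
print(pairs)                              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(pairs) == "AAAABBBCCD")  # True
```

The round trip returning the exact original string is what makes the technique lossless.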