Math 221 Huffman Codes
Represent the Code with a Tree
[Figure: binary tree whose edges are labeled 0 and 1 and whose leaves are a, e, i, t, sp, nl, s]
Some Terminology
• A tree is a collection of nodes joined by edges that has no “closed circuits”, i.e. any path that ends at the node it started from must intersect itself.
• A node with no edges coming out of it is a leaf.
• A node connected to a node directly above it is a child of that node.
• We will only consider binary trees, i.e. each node has at most two children.
• Convention: 0 means go to the left, 1 means go to the right.
Important
If every codeword of a code corresponds to a leaf of a binary tree, then no codeword is a prefix of another, so a binary string built from codewords can be uniquely decoded!
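To make the decoding claim concrete, here is a minimal Python sketch. The code table `CODES` is a hypothetical example (it is not the code from the slides), and the helper names `build_tree`, `encode` and `decode` are likewise assumptions made for illustration. The tree is stored as nested dictionaries, with the keys "0" and "1" standing for the left and right edges.

```python
# Hypothetical code table (an assumption, not taken from the slides).
# Every symbol sits at a leaf, so no codeword is a prefix of another.
CODES = {"a": "00", "e": "01", "i": "10", "t": "110", "s": "111"}


def build_tree(codes):
    """Build a nested-dict binary tree; keys '0'/'1' are left/right edges."""
    root = {}
    for symbol, bits in codes.items():
        node = root
        for bit in bits[:-1]:
            node = node.setdefault(bit, {})
        node[bits[-1]] = symbol  # a leaf stores the symbol itself
    return root


def encode(symbols, codes):
    """Concatenate the codewords of the given symbols."""
    return "".join(codes[s] for s in symbols)


def decode(bits, root):
    """Walk down from the root; emit a symbol and restart at every leaf."""
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root
    if node is not root:
        raise ValueError("bit string ended in the middle of a codeword")
    return out


tree = build_tree(CODES)
message = encode(["s", "e", "a", "t"], CODES)
print(message, decode(message, tree))
```

Because the walk only emits a symbol when it reaches a leaf, there is never a choice about where one codeword ends and the next begins.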
Improving the Code
• Notice that the newline does not have a sibling.
• Thus, we can move it up into its parent node and get a shorter code!
• This reduces the number of bits needed to represent a newline from three to two.
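The improvement can be sketched in Python as follows. The slide's tree did not survive as text, so the three-bit table `FIXED` below is an assumed example chosen so that the newline has no sibling; the function name `shorten_only_children` is also hypothetical. A leaf has no sibling when no codeword starts with its codeword with the last bit flipped, and in that case the last bit can be dropped.

```python
# Assumed fixed-length table (hypothetical; the slide's figure is not available).
FIXED = {"a": "000", "e": "001", "i": "010", "s": "011",
         "t": "100", "sp": "101", "nl": "110"}


def shorten_only_children(codes):
    """Move every leaf without a sibling up into its parent node."""
    codes = dict(codes)  # work on a copy
    changed = True
    while changed:
        changed = False
        for symbol, bits in list(codes.items()):
            if len(bits) < 2:
                continue  # keep at least one bit per symbol
            sibling = bits[:-1] + ("1" if bits[-1] == "0" else "0")
            # The sibling subtree is empty if no codeword starts with it.
            if not any(other.startswith(sibling) for other in codes.values()):
                codes[symbol] = bits[:-1]
                changed = True
    return codes


print(shorten_only_children(FIXED)["nl"])
```

With this assumed table only the newline changes, from `110` to `11`, which matches the drop from three bits to two described above.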
Huffman’s Algorithm
• Every node is given its frequency as a weight.
• Join the two nodes with the lowest weights under a new parent node.
• Now we have a tree. In this algorithm the weight of a tree is the sum of the weights of its leaves.
• At each later stage, join the two trees with the lowest weights, where a node that has not been joined yet counts as a one-node tree. Stop when a single tree remains.
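The steps above can be sketched in Python with a priority queue from the standard `heapq` module. This is one possible implementation under stated assumptions, not necessarily how the course implements it: the function name `huffman_codes`, the tuple representation of trees, and the sample frequencies are all hypothetical.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return {symbol: bit string} by repeatedly joining the two lightest trees.

    Assumes freq is a non-empty dict mapping string symbols to weights.
    A tree is either a symbol (a leaf) or a (left, right) pair.
    """
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # lightest tree
        w2, _, t2 = heapq.heappop(heap)  # second lightest tree
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # 0 means go left
            walk(tree[1], prefix + "1")  # 1 means go right
        else:
            codes[tree] = prefix or "0"  # a lone symbol still gets one bit

    walk(heap[0][2], "")
    return codes


# Hypothetical frequencies, chosen only to illustrate the call.
print(huffman_codes({"a": 5, "b": 2, "c": 1, "d": 1}))
```

Which of two equal-weight trees goes left is a tie-breaking choice, so the exact bit strings can differ between implementations even when the codeword lengths agree.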
Our example
We start with seven leaves (labels a, e, i, t, sp, nl, s; weights 10, 15, 12, 3, 4, 13, 1, listed in the order they appear on the slide), which becomes, one join at a time:
• T1, weight 4
• T2, weight 8, containing T1
• T3, weight 18, containing T2
• T4, weight 25
• T5, weight 33, containing T3
• T6, weight 58, containing T4 and T5
And we are done!
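As a quick check of the example, the following Python sketch replays the joins using only the seven leaf weights from the slide. Which weight belongs to which symbol is not needed here, because the weight of each new tree depends only on the weights present. The function name `merge_weights` is an assumption made for illustration.

```python
import heapq


def merge_weights(weights):
    """Return the weight of each tree created while joining the two lightest."""
    heap = list(weights)
    heapq.heapify(heap)
    merged = []
    while len(heap) > 1:
        total = heapq.heappop(heap) + heapq.heappop(heap)
        merged.append(total)
        heapq.heappush(heap, total)  # the new tree competes with the rest
    return merged


print(merge_weights([10, 15, 12, 3, 4, 13, 1]))  # [4, 8, 18, 25, 33, 58]
```

The six values are the weights of T1 through T6, and the last one, 58, is the sum of all seven leaf weights.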