Lecture 11: Representing Sets (2.3.3), Huffman Encoding Trees (2.3.4)
Extended Introduction, Lecture 11

The Abstract Data Type Set
A set is a collection of distinct items. The following are the allowable set operations:
    (empty-set? set)              Set → Boolean
    (make-empty-set)              → Set
    (element-of-set? x set)       Item × Set → Boolean
    (adjoin-set x set)            Item × Set → Set
    (union-set s1 s2)             Set × Set → Set
    (intersection-set s1 s2)      Set × Set → Set
Contract: the operations have the usual meaning of set operations.

Implementing Sets
• Must decide on a representation
  • How a set is represented.
• Then must write an implementation
  • Write the code of the operations.
  • Each operation can be written separately.
We will take a look at several alternative representations/implementations.

Version 1: Unordered List
Representation: a set is represented as a list. No duplicates are allowed.
Implementation:
    (define (empty-set? set) (null? set))
    (define (make-empty-set) '())

Version 1: element-of-set?
    (define (element-of-set? x set)
      (cond ((null? set) #f)
            ((equal? x (car set)) #t)
            (else (element-of-set? x (cdr set)))))
equal? behaves like eq? on symbols, compares lists and trees by their contents, and can also be applied to numbers and strings.
    (eq? (list 'a 'b) (list 'a 'b))      ⇒ #f
    (equal? (list 'a 'b) (list 'a 'b))   ⇒ #t

Version 1: Adjoin-set
    (define (adjoin-set x set)
      (if (element-of-set? x set)
          set
          (cons x set)))

Version 1: Intersection
    (define (intersection-set set1 set2)
      (cond ((or (null? set1) (null? set2)) '())
            ((element-of-set? (car set1) set2)
             (cons (car set1)
                   (intersection-set (cdr set1) set2)))
            (else (intersection-set (cdr set1) set2))))
Or is this better?
    (define (intersection-set set1 set2)
      (cond ((or (null? set1) (null? set2)) '())
            ((element-of-set? (car set1) set2)
             (adjoin-set (car set1)
                         (intersection-set (cdr set1) set2)))
            (else (intersection-set (cdr set1) set2))))

Version 1: Union
    (define (union-set set1 set2)
      (cond ((null? set1) set2)
            ((not (element-of-set? (car set1) set2))
             (cons (car set1)
                   (union-set (cdr set1) set2)))
            (else (union-set (cdr set1) set2))))
Is the alternative better this time?
    (define (union-set set1 set2)
      (cond ((null? set1) set2)
            (else (adjoin-set (car set1)
                              (union-set (cdr set1) set2)))))

Version 1: Time Complexity
Suppose a set contains n elements.
    element-of-set?     Θ(n)
    adjoin-set          Θ(n)
    intersection-set    Θ(n²)
    union-set           Θ(n²)

Version 2: Ordered List
Representation: a set (of numbers) is represented as an ordered list without duplicates.
    (define (element-of-set? x set)
      (cond ((null? set) #f)
            ((= x (car set)) #t)
            ((< x (car set)) #f)
            (else (element-of-set? x (cdr set)))))
Time complexity: Θ(n)
You will implement adjoin-set yourself. empty-set? and make-empty-set are the same as in version 1.

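One possible adjoin-set for the ordered-list representation is sketched below. It is not taken from the slides; it assumes, as the slide does, that the elements are numbers and that the list is sorted in increasing order without duplicates.

```scheme
;; Sketch: adjoin-set for an ordered list of numbers (assumed increasing order).
(define (adjoin-set x set)
  (cond ((null? set) (list x))            ; x is larger than every element
        ((= x (car set)) set)             ; already present: return set unchanged
        ((< x (car set)) (cons x set))    ; x belongs in front of (car set)
        (else (cons (car set)
                    (adjoin-set x (cdr set))))))

(adjoin-set 5 '(1 3 7 9))   ; => (1 3 5 7 9)
```

Like element-of-set?, this stops as soon as it passes the position where x belongs, so it is still Θ(n) in the worst case.
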
Version 2: Intersection
intersection-set from version 1 will work here.
    (define (intersection-set set1 set2)
      (cond ((or (null? set1) (null? set2)) '())
            ((element-of-set? (car set1) set2)
             (cons (car set1)
                   (intersection-set (cdr set1) set2)))
            (else (intersection-set (cdr set1) set2))))
But its complexity is Θ(n²). Can we do better?

Version 2: Better Intersection
    (define (intersection-set set1 set2)
      (if (or (null? set1) (null? set2))
          '()
          (let ((x1 (car set1)) (x2 (car set2)))
            (cond ((= x1 x2)
                   (cons x1
                         (intersection-set (cdr set1) (cdr set2))))
                  ((< x1 x2)
                   (intersection-set (cdr set1) set2))
                  (else
                   (intersection-set set1 (cdr set2)))))))

Version 2: Intersection Example
    set1         set2         intersection
    (1 3 7 9)    (1 4 6 7)    (1
    (3 7 9)      (4 6 7)      (1
    (7 9)        (4 6 7)      (1
    (7 9)        (6 7)        (1
    (7 9)        (7)          (1
    (9)          ()           (1 7)
Time and space: Θ(n)
Union: similar.

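Since the slide only says that union is similar, here is a minimal sketch of what a merge-style union-set might look like for ordered lists of numbers. It is an illustration under the same assumptions as the better intersection-set (both inputs sorted, no duplicates), not code from the lecture.

```scheme
;; Sketch: union of two ordered lists of numbers, one pass over both inputs.
(define (union-set set1 set2)
  (cond ((null? set1) set2)
        ((null? set2) set1)
        (else
         (let ((x1 (car set1)) (x2 (car set2)))
           (cond ((= x1 x2)                      ; keep one copy, advance both
                  (cons x1 (union-set (cdr set1) (cdr set2))))
                 ((< x1 x2)                      ; x1 is the smallest remaining
                  (cons x1 (union-set (cdr set1) set2)))
                 (else                           ; x2 is the smallest remaining
                  (cons x2 (union-set set1 (cdr set2)))))))))

(union-set '(1 3 7 9) '(1 4 6 7))   ; => (1 3 4 6 7 9)
```

Each step removes at least one element from one of the lists, which is why the running time is linear in the total size of the two sets.
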
Complexity
                        unordered    ordered
    empty-set?          Θ(1)         Θ(1)
    make-empty-set      Θ(1)         Θ(1)
    element-of-set?     Θ(n)         Θ(n)
    adjoin-set          Θ(n)         Θ(n)
    intersection-set    Θ(n²)        Θ(n)
    union-set           Θ(n²)        Θ(n)

Version 3: Binary Trees
Binary Search: Lion in the desert

Version 3: Binary Trees
Store the elements in the nodes of a binary tree.
The values in the left subtree of a node v are all smaller than the value stored at v.
The values in the right subtree of a node v are all larger than the value stored at v.
A possible representation of the set {1,3,5,7,9,12}:
[Figure: binary tree with nodes 7, 9, 3, 12, 5, 1]

Version 3: Binary Trees
A set has many representations:
[Figure: two trees for the same set {1,3,5,7,9,12}, labeled "Balanced Tree" (height = Θ(log n)) and "Unbalanced Tree"]

Version 3: Representation
    (define (make-tree entry left right)
      (list entry left right))
    (define (entry tree) (car tree))
    (define (left-branch tree) (cadr tree))
    (define (right-branch tree) (caddr tree))

Version 3: Element-of-set
    (define (element-of-set? x set)
      (cond ((null? set) #f)
            ((= x (entry set)) #t)
            ((< x (entry set))
             (element-of-set? x (left-branch set)))
            (else (element-of-set? x (right-branch set)))))
Complexity: Θ(h), where h is the height of the tree.
If the tree is balanced, then h ≈ log(n).
In the worst case, h ≈ n.

Version 3: Adjoin-set
    (define (adjoin-set x set)
      (cond ((null? set) (make-tree x '() '()))
            ((= x (entry set)) set)
            ((< x (entry set))
             (make-tree (entry set)
                        (adjoin-set x (left-branch set))
                        (right-branch set)))
            (else
             (make-tree (entry set)
                        (left-branch set)
                        (adjoin-set x (right-branch set))))))
Complexity: Θ(h), where h is the height of the tree.

Complexity
We omit the trivial operations.
                        trees    unordered    ordered
    element-of-set?     Θ(h)     Θ(n)         Θ(n)
    adjoin-set          Θ(h)     Θ(n)         Θ(n)
    intersection-set    Θ(n)     Θ(n²)        Θ(n)
    union-set           Θ(n)     Θ(n²)        Θ(n)
If a tree is roughly balanced, then h ≈ log(n).
Main challenge: keep the trees roughly balanced. (More on this next term in "Data Structures".)

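The slides do not show how the Θ(n) bounds for tree intersection and union are obtained. One common approach, assumed here rather than taken from the lecture, is to flatten each tree into an ordered list in Θ(n), apply the ordered-list procedure, and rebuild a tree from the result. The sketch below covers only the flattening step; tree->list, copy-to-list and example-tree are illustrative names, and the shape of example-tree is an assumption.

```scheme
;; Tree representation as on the slides.
(define (make-tree entry left right) (list entry left right))
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))

;; In-order traversal. The result is accumulated from the right,
;; so every node costs exactly one cons: Θ(n) overall.
(define (tree->list tree)
  (define (copy-to-list tree result)
    (if (null? tree)
        result
        (copy-to-list (left-branch tree)
                      (cons (entry tree)
                            (copy-to-list (right-branch tree) result)))))
  (copy-to-list tree '()))

;; One possible tree for the set {1,3,5,7,9,12} (shape assumed).
(define example-tree
  (make-tree 7
             (make-tree 3 (make-tree 1 '() '()) (make-tree 5 '() '()))
             (make-tree 9 '() (make-tree 12 '() '()))))

(tree->list example-tree)   ; => (1 3 5 7 9 12)
```

Because the traversal visits the left subtree, then the entry, then the right subtree, the output is always sorted, whatever the shape of the tree.
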
Random trees are fairly balanced
    (define (rand-tree n range)
      (if (= n 0)
          '()
          (adjoin-set (random range)
                      (rand-tree (- n 1) range))))
    (define (height tree)
      (if (null? tree)
          0
          (+ 1 (max (height (left-branch tree))
                    (height (right-branch tree))))))
    (height (rand-tree 1000 1000000))    ⇒ ~22
    (height (rand-tree 10000 1000000))   ⇒ ~31
• Averaged over several runs.
• Height of a balanced tree: ~10 and ~13, respectively.

Huffman encoding trees

Data Transmission
[Figure: Alice sends the message “sos” to Bob]
We wish to send information efficiently from Alice to Bob.
Morse code is not necessarily the most efficient code you could think of.

Fixed Length Codes
Represent data as a sequence of 0's and 1's.
Sequence: BACADAEAFABBAAAGAH
A fixed-length code (like ASCII, but here with 3 bits per symbol):
    A 000    B 001    C 010    D 011
    E 100    F 101    G 110    H 111
Encoding of sequence:
    001000010000011000100000101000001001000000000110000111
The encoding is 18 × 3 = 54 bits long. Can we make the encoding shorter?

Variable Length Code
Make use of frequencies. Relative frequency of A = 8, B = 3, others 1. (These are the weights used for the code tree; the example sequence itself contains nine A's.)
    A 0       B 100     C 1010    D 1011
    E 1100    F 1101    G 1110    H 1111
Example: BACADAEAFABBAAAGAH
    100010100101101100011010100100000111001111
42 bits (about 22% shorter than the 54-bit fixed-length encoding).
But how do we decode?

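The two bit counts can be checked with a few lines of Scheme. This is only an illustrative sketch: message, code-lengths and encoded-length are hypothetical names, and the table holds the codeword lengths of the variable-length code above.

```scheme
;; The example sequence as a list of symbols.
(define message '(B A C A D A E A F A B B A A A G A H))

;; Codeword length of each symbol in the variable-length code.
(define code-lengths
  '((A 1) (B 3) (C 4) (D 4) (E 4) (F 4) (G 4) (H 4)))

;; Sum the codeword lengths of all symbols in the message.
(define (encoded-length message lengths)
  (if (null? message)
      0
      (+ (cadr (assq (car message) lengths))
         (encoded-length (cdr message) lengths))))

(encoded-length message code-lengths)   ; => 42
(* 3 (length message))                  ; => 54 for the 3-bit fixed-length code
```

The count breaks down as 9 × 1 bits for the A's, 3 × 3 bits for the B's, and 6 × 4 bits for the remaining symbols.
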
Prefix code ↔ Binary tree
Prefix code: no codeword is a prefix of any other codeword.
    A 0       B 100     C 1010    D 1011
    E 1100    F 1101    G 1110    H 1111
[Figure: binary code tree with edges labeled 0 and 1 and leaves A, B, C, D, E, F, G, H]

Decoding Example
[Figure: the same code tree, followed from the root for each bit]
    Bits: 10001010
    100           → B
    100 0         → BA
    100 0 1010    → BAC

Abstract representation of code trees
Constructors:
    make-leaf         Construct a leaf
    make-code-tree    Construct a code tree
Predicates:
    leaf?             Is the object a leaf?
Selectors:
    left-branch       Select the left branch
    right-branch      Select the right branch
    symbol-leaf       The symbol attached to a leaf

Decoding a Message
    (define (decode bits tree)
      (define (decode-one bits current-branch)
        (if (null? bits)
            '()
            (let ((next-branch
                   (choose-branch (car bits) current-branch)))
              (if (leaf? next-branch)
                  (cons (symbol-leaf next-branch)
                        (decode-one (cdr bits) tree))
                  (decode-one (cdr bits) next-branch)))))
      (decode-one bits tree))

    (define (choose-branch bit branch)
      (cond ((= bit 0) (left-branch branch))
            ((= bit 1) (right-branch branch))
            (else (error "bad bit -- CHOOSE-BRANCH" bit))))

Huffman Tree = Optimal Length Code
Optimal: no code has a better weighted average length.
[Figure: the code tree with leaf weights A 8, B 3, and 1 for each of C, D, E, F, G, H]

Representation
    {A,B,C,D,E,F,G,H} 17
    ├── A 8
    └── {B,C,D,E,F,G,H} 9
        ├── {B,C,D} 5
        │   ├── B 3
        │   └── {C,D} 2
        │       ├── C 1
        │       └── D 1
        └── {E,F,G,H} 4
            ├── {E,F} 2
            │   ├── E 1
            │   └── F 1
            └── {G,H} 2
                ├── G 1
                └── H 1

Representation (Cont.)
    (define (make-leaf symbol weight)
      (list 'leaf symbol weight))
    (define (leaf? object)
      (eq? (car object) 'leaf))
    (define (symbol-leaf x) (cadr x))
    (define (weight-leaf x) (caddr x))

Representation (Cont.)
    (define (make-code-tree left right)
      (list left
            right
            (append (symbols left) (symbols right))
            (+ (weight left) (weight right))))
    (define (left-branch tree) (car tree))
    (define (right-branch tree) (cadr tree))

Representation (Cont.)
    (define (symbols tree)
      (if (leaf? tree)
          (list (symbol-leaf tree))
          (caddr tree)))
    (define (weight tree)
      (if (leaf? tree)
          (weight-leaf tree)
          (cadddr tree)))

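To see the representation and decode working together, the sketch below builds the code tree for the prefix code A 0, B 100, C 1010, ... by hand and decodes the bits 10001010 from the decoding example. The procedures are the ones from the slides, repeated so the block runs on its own; sample-tree is a hypothetical name introduced here.

```scheme
;; Leaves and code trees, as defined on the slides.
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))
(define (make-code-tree left right)
  (list left right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))

;; decode and choose-branch, as defined on the slides.
(define (choose-branch bit branch)
  (cond ((= bit 0) (left-branch branch))
        ((= bit 1) (right-branch branch))
        (else (error "bad bit -- CHOOSE-BRANCH" bit))))
(define (decode bits tree)
  (define (decode-one bits current-branch)
    (if (null? bits)
        '()
        (let ((next-branch (choose-branch (car bits) current-branch)))
          (if (leaf? next-branch)
              (cons (symbol-leaf next-branch)
                    (decode-one (cdr bits) tree))
              (decode-one (cdr bits) next-branch)))))
  (decode-one bits tree))

;; Hand-built tree: bit 0 selects the left branch, bit 1 the right branch.
(define sample-tree
  (make-code-tree
   (make-leaf 'A 8)
   (make-code-tree
    (make-code-tree (make-leaf 'B 3)
                    (make-code-tree (make-leaf 'C 1) (make-leaf 'D 1)))
    (make-code-tree
     (make-code-tree (make-leaf 'E 1) (make-leaf 'F 1))
     (make-code-tree (make-leaf 'G 1) (make-leaf 'H 1))))))

(decode '(1 0 0 0 1 0 1 0) sample-tree)   ; => (B A C)
(weight sample-tree)                      ; => 17
```

Note that decode restarts from the root of the whole tree every time it reaches a leaf, which is exactly what makes a prefix code decodable without separators.
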
Huffman’s Algorithm
Build the tree bottom-up, so that the lowest-weight leaves are farthest from the root.
Repeatedly:
• Find the two trees of lowest weight.
• Merge them to form a new tree whose weight is the sum of their weights.

Construction of Huffman tree
[Figure: the Huffman tree built bottom-up from the leaves A 8, B 3, C 1, D 1, E 1, F 1, G 1, H 1, with internal nodes {C,D} 2, {E,F} 2, {G,H} 2, {E,F,G,H} 4, {B,C,D} 5, {B,C,D,E,F,G,H} 9 and root {A,B,C,D,E,F,G,H} 17]

Construction of Huffman Tree
    Initial leaves   {(A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (G 1) (H 1)}
    Merge            {(A 8) (B 3) ({C D} 2) (E 1) (F 1) (G 1) (H 1)}
    Merge            {(A 8) (B 3) ({C D} 2) ({E F} 2) (G 1) (H 1)}
    Merge            {(A 8) (B 3) ({C D} 2) ({E F} 2) ({G H} 2)}
    Merge            {(A 8) (B 3) ({C D} 2) ({E F G H} 4)}
    Merge            {(A 8) ({B C D} 5) ({E F G H} 4)}
    Merge            {(A 8) ({B C D E F G H} 9)}
    Final merge      {({A B C D E F G H} 17)}

Construction of Huffman tree
    (generate-huffman-tree
     '((A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (H 1) (G 1)))
Result:
    ((leaf a 8)
     ((((leaf g 1) (leaf h 1) (g h) 2)
       ((leaf f 1) (leaf e 1) (f e) 2)
       (g h f e) 4)
      (((leaf d 1) (leaf c 1) (d c) 2)
       (leaf b 3)
       (d c b) 5)
      (g h f e d c b) 9)
     (a g h f e d c b) 17)
Each internal node is a list of the form (left-branch right-branch symbols weight).

Construction of Huffman tree
    (define (generate-huffman-tree pairs)
      (successive-merge (make-leaf-srt-lst pairs)))

    ;; Sort pairs: turn the pairs into leaves, ordered by weight.
    (define (make-leaf-srt-lst pairs)
      (if (null? pairs)
          '()
          (let ((pair (car pairs)))
            ;; Ordered insert of the new leaf into the sorted rest.
            (insert-srt (make-leaf (car pair) (cadr pair))
                        (make-leaf-srt-lst (cdr pairs))))))

Construction of Huffman tree
    (define (insert-srt x s-lst)
      (cond ((null? s-lst) (list x))
            ((< (weight x) (weight (car s-lst)))
             (cons x s-lst))
            (else
             (cons (car s-lst)
                   (insert-srt x (cdr s-lst))))))

    (define (successive-merge trees)
      (if (null? (cdr trees))
          (car trees)
          (let ((smallest (car trees))
                (2smallest (cadr trees))
                (rest (cddr trees)))
            (successive-merge
             (insert-srt (make-code-tree smallest 2smallest)
                         rest)))))

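Putting the pieces together, the following sketch runs the construction on the example frequencies and computes the weighted code length of the resulting tree. The tree procedures repeat the ones from the slides so the block is self-contained; weighted-length and huffman-tree are hypothetical names added for illustration, and the local variables in successive-merge are renamed.

```scheme
;; Leaves and code trees, as defined on the slides.
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))
(define (make-code-tree left right)
  (list left right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))

;; Ordered insert by weight; ties go after existing equal-weight trees.
(define (insert-srt x s-lst)
  (cond ((null? s-lst) (list x))
        ((< (weight x) (weight (car s-lst))) (cons x s-lst))
        (else (cons (car s-lst) (insert-srt x (cdr s-lst))))))

(define (make-leaf-srt-lst pairs)
  (if (null? pairs)
      '()
      (insert-srt (make-leaf (car (car pairs)) (cadr (car pairs)))
                  (make-leaf-srt-lst (cdr pairs)))))

;; Repeatedly merge the two lightest trees until one tree remains.
(define (successive-merge trees)
  (if (null? (cdr trees))
      (car trees)
      (successive-merge
       (insert-srt (make-code-tree (car trees) (cadr trees))
                   (cddr trees)))))

(define (generate-huffman-tree pairs)
  (successive-merge (make-leaf-srt-lst pairs)))

;; Hypothetical helper: sum of (leaf weight × leaf depth) over all leaves.
(define (weighted-length tree depth)
  (if (leaf? tree)
      (* (weight-leaf tree) depth)
      (+ (weighted-length (left-branch tree) (+ depth 1))
         (weighted-length (right-branch tree) (+ depth 1)))))

(define huffman-tree
  (generate-huffman-tree
   '((A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (H 1) (G 1))))

(weight huffman-tree)              ; => 17
(weighted-length huffman-tree 0)   ; => 41
```

With the weights A 8, B 3 and 1 for the rest, A ends up at depth 1, B at depth 3 and the other symbols at depth 4, giving 8 × 1 + 3 × 3 + 6 × 4 = 41 weighted bits.
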
Summary
We saw today that even a seemingly simple abstract concept like a set, when implemented on a computer, can give rise to many implementations, each with a different complexity.