
Analysis & Design of Algorithms (CSCE 321)

Prof. Amr Goneid, Department of Computer Science, AUC. Part 8: Greedy Algorithms.


Presentation Transcript


  1. Analysis & Design of Algorithms (CSCE 321). Prof. Amr Goneid, Department of Computer Science, AUC. Part 8: Greedy Algorithms

  2. Greedy Algorithms

  3. Greedy Algorithms: Microsoft Interview. From: http://www.cs.pitt.edu/~kirk/cs1510/

  4. Greedy Algorithms
     • Greedy Algorithms
     • The General Method
     • Continuous Knapsack Problem
     • Optimal Merge Patterns

  5. 1. Greedy Algorithms
     Methodology:
     • Start with a solution to a small sub-problem.
     • Build up to the whole problem.
     • Make choices that look good in the short term but not necessarily in the long term.

  6. Greedy Algorithms
     Disadvantages:
     • They do not always work.
     • Short-term choices may be disastrous in the long term.
     • Correctness is hard to prove.
     Advantages:
     • When they work, they work fast.
     • Simple and easy to implement.

  7. 2. The General Method
     Let a[ ] be an array of elements that may contribute to a solution, and let S be a solution.
     Greedy (a[ ], n)
     {
        S = empty;
        for each element (i) from a[ ], i = 1:n
        {
           x = Select (a, i);
           if (Feasible(S, x)) S = Union(S, x);
        }
        return S;
     }

  8. The General Method (continued)
     • Select: Selects an element from a[ ] and removes it. The selection is optimized to satisfy an objective function.
     • Feasible: True if the selected value can be included in the solution vector, False otherwise.
     • Union: Combines the value with the solution and updates the objective function.
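The template and its three operations could be sketched in Python roughly as follows. The names `greedy`, `select_best_ratio`, and `fits` are illustrative assumptions, and the whole-item instance (capacity 20) is a hypothetical example chosen only to show Feasible rejecting a selected element; it is not claimed to be optimal.

```python
from typing import Callable, List, Tuple

Item = Tuple[int, int]  # (weight, profit); an assumed representation


def greedy(a: List[Item],
           select: Callable[[List[Item]], Item],
           feasible: Callable[[List[Item], Item], bool]) -> List[Item]:
    """Generic greedy loop: select, test feasibility, add to the solution."""
    remaining = list(a)           # work on a copy, because select() removes elements
    solution: List[Item] = []     # S = empty
    for _ in range(len(a)):       # one selection per input element
        x = select(remaining)     # Select: best remaining element, removed from the pool
        if feasible(solution, x):
            solution.append(x)    # Union: add x to the solution
    return solution


# Hypothetical instance: whole items only, knapsack capacity 20.
items = [(18, 25), (15, 24), (10, 15)]
capacity = 20


def select_best_ratio(remaining: List[Item]) -> Item:
    """Pick and remove the item with the highest profit/weight ratio."""
    best = max(remaining, key=lambda wp: wp[1] / wp[0])
    remaining.remove(best)
    return best


def fits(solution: List[Item], x: Item) -> bool:
    """Feasible if the chosen weights plus x stay within the capacity."""
    return sum(w for w, _ in solution) + x[0] <= capacity


chosen = greedy(items, select_best_ratio, fits)
print(chosen)
```

Here Select and Feasible are passed in as functions, so the same loop can be reused with a different objective or constraint.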

  9. 3. Continuous Knapsack Problem

  10. Continuous Knapsack Problem
      Environment:
      • Object (i): total weight $w_i$, total profit $p_i$. The fraction of object (i) taken is continuous ($0 \le x_i \le 1$).
      • A number of objects, $1 \le i \le n$.
      • A knapsack of capacity $m$.
      [Figure: objects 1, 2, ..., n and a knapsack of capacity m]

  11. The Problem
      • Problem Statement: For n objects with weights $w_i$ and profits $p_i$, obtain the set of fractions of objects $x_i$ which will maximize the total profit without exceeding a total weight $m$.
      • Formally: Obtain the set $X = (x_1, x_2, \ldots, x_n)$ that will maximize $\sum_{1 \le i \le n} p_i x_i$ subject to the constraints $\sum_{1 \le i \le n} w_i x_i \le m$, $0 \le x_i \le 1$, $1 \le i \le n$.

  12. Optimal Solution
      • Feasible Solution: a solution that satisfies the constraints.
      • Optimal Solution: a feasible solution that maximizes the profit.
      • Lemma 1: If $\sum_{1 \le i \le n} w_i = m$ then $x_i = 1$ (for all i) is optimal.
      • Lemma 2: An optimal solution will give $\sum_{1 \le i \le n} w_i x_i = m$.

  13. Greedy Algorithm
      • To maximize profit, choose highest p first.
      • Also choose highest x, i.e., smallest w first.
      • In other words, let us define the "value" of an object (i) to be the ratio $v_i = p_i / w_i$, and so we choose first the object with the highest $v_i$ value.

  14. Algorithm
      GreedyKnapsack ( p[ ], w[ ], m, n, x[ ] )
      {
         insert indices (i) of items in a maximum heap on value vi = pi / wi;
         zero the vector x;
         Rem = m;
         for k = 1..n
         {
            remove top of heap to get index (i);
            if (w[i] > Rem) then break;
            x[i] = 1.0;
            Rem = Rem - w[i];
         }
         if (k <= n) x[i] = Rem / w[i];
      }
      // T(n) = O(n log n)

  15. Example
      • n = 3 objects, m = 20
      • P = (25, 24, 15), W = (18, 15, 10), V = (1.39, 1.6, 1.5)
      • Objects in decreasing order of V are {2, 3, 1}
      • Set X = {0, 0, 0} and Rem = m = 20
      • k = 1, choose object i = 2: $w_2 <$ Rem, set $x_2 = 1$, $w_2 x_2 = 15$, Rem = 5
      • k = 2, choose object i = 3: $w_3 >$ Rem, break
      • k <= n, so $x_3$ = Rem / $w_3$ = 0.5
      • Optimal solution is X = (0, 1.0, 0.5)
      • Total profit is $\sum_{1 \le i \le n} p_i x_i = 31.5$
      • Total weight is $\sum_{1 \le i \le n} w_i x_i = m = 20$
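A minimal Python sketch of GreedyKnapsack, run on the example above. It assumes 0-based indices and positive weights, and uses the standard-library `heapq` (a min-heap) with negated ratios in place of the maximum heap; the function name `greedy_knapsack` is an illustrative choice.

```python
import heapq
from typing import List


def greedy_knapsack(p: List[float], w: List[float], m: float) -> List[float]:
    """Return the fractions x[i] for the continuous knapsack (0-based indices)."""
    n = len(p)
    # heapq is a min-heap, so negated ratios make the largest p/w pop first.
    heap = [(-p[i] / w[i], i) for i in range(n)]
    heapq.heapify(heap)
    x = [0.0] * n
    rem = m
    while heap:
        _, i = heapq.heappop(heap)
        if w[i] > rem:
            x[i] = rem / w[i]    # take only the fraction that still fits, then stop
            break
        x[i] = 1.0               # the whole object fits
        rem -= w[i]
    return x


p = [25, 24, 15]
w = [18, 15, 10]
x = greedy_knapsack(p, w, 20)
profit = sum(pi * xi for pi, xi in zip(p, x))
weight = sum(wi * xi for wi, xi in zip(w, x))
print(x, profit, weight)   # fractions, total profit, total weight
```

Building the heap is O(n) with `heapify`, and each of the at most n removals costs O(log n), which matches the O(n log n) bound stated on the slide.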

  16. 4. Optimal Merge Patterns
      (a) Definitions
      • Binary Merge Tree: A binary tree with external nodes representing entities and internal nodes representing merges of these entities.
      • Optimal Binary Merge Tree: The sum of paths from root to external nodes is optimal (e.g. minimum). Assuming that node (i) contributes to the cost by $p_i$ and the path from the root to that node has length $L_i$, optimality requires a pattern that minimizes $\sum_{1 \le i \le n} p_i L_i$.

  17. Optimal Binary Merge Tree
      If the items {A, B, C} contribute to the merge cost by $P_A$, $P_B$, $P_C$, respectively, then the following 3 different patterns will cost:
      P1 = $2(P_A + P_B) + P_C$
      P2 = $P_A + 2(P_B + P_C)$
      P3 = $2P_A + P_B + 2P_C$
      Which of these merge patterns is optimal?

  18. (b) Optimal Merging of Lists
      Lists {A, B, C} have lengths 30, 25, 10, respectively. The cost of merging two lists of lengths n, m is n + m. The following 3 different merge patterns will cost:
      P1 = 2(30 + 25) + 10 = 120
      P2 = 30 + 2(25 + 10) = 100
      P3 = 25 + 2(30 + 10) = 105
      P2 is optimal, so the merge order is {{B, C}, A}.

  19. The Greedy Method
      • Insert lists and their lengths in a minimum heap of lengths.
      • Repeat
          • Remove the two shortest lists, with lengths ($p_i$, $p_j$), from the heap.
          • Merge the lists with lengths ($p_i$, $p_j$) to form a new list with length $p_{ij} = p_i + p_j$.
          • Insert $p_{ij}$ and its list into the heap
        until all lists are merged into one final list.

  20. The Greedy Method
      • Notice that both lists (B: 25 elements) and (C: 10 elements) have been merged (moved) twice.
      • List (A: 30 elements) has been merged (moved) only once.
      • Hence the total number of element moves is 100.
      • This is the minimum among the three merge patterns.
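The greedy merge order can be sketched with Python's standard-library `heapq`. This version tracks only list lengths (not the list contents), which is enough to reproduce the total number of element moves; the function name `optimal_merge_cost` is an illustrative assumption.

```python
import heapq
from typing import List


def optimal_merge_cost(lengths: List[int]) -> int:
    """Total element moves when the two shortest lists are always merged first."""
    heap = list(lengths)
    heapq.heapify(heap)                # minimum heap of lengths
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)        # shortest list
        b = heapq.heappop(heap)        # second shortest list
        total += a + b                 # merging costs the sum of the two lengths
        heapq.heappush(heap, a + b)    # the merged list goes back into the heap
    return total


cost = optimal_merge_cost([30, 25, 10])   # lists A, B, C from the example
print(cost)
```

For lengths 30, 25, 10 the loop first merges B and C (35 moves) and then merges the result with A (65 moves).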

  21. (c) Huffman Coding
      Terminology:
      • Symbol: A one-to-one representation of a single entity.
      • Alphabet: A finite set of symbols.
      • Message: A sequence of symbols.
      • Encoding: Translating symbols to a string of bits.
      • Decoding: The reverse.

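  22. Example: Coding Tree for 4-Symbol Alphabet (a, b, c, d)
      • Encoding: a = 00, b = 01, c = 10, d = 11
      • Decoding: 01 10 00 11 00 gives b c a d a
      • This is fixed-length coding.
      [Figure: coding tree with root abcd, internal nodes ab and cd, and leaves a, b, c, d; the two branches of each node are labeled 0 and 1]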
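The fixed-length example can be checked with a few lines of Python. The table `CODE` is taken from the slide; the helper names `encode` and `decode` are illustrative.

```python
# Fixed-length (2-bit) code table from the slide.
CODE = {"a": "00", "b": "01", "c": "10", "d": "11"}
DECODE = {bits: sym for sym, bits in CODE.items()}


def encode(message: str) -> str:
    """Replace every symbol by its 2-bit codeword."""
    return "".join(CODE[s] for s in message)


def decode(bits: str, width: int = 2) -> str:
    """Cut the bit string into fixed-width chunks and look each one up."""
    return "".join(DECODE[bits[i:i + width]] for i in range(0, len(bits), width))


decoded = decode("0110001100")
print(decoded)
print(encode(decoded))
```

Because every codeword has the same width, decoding needs no tree walk; the bit string is simply split every two bits.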

  23. Coding Efficiency & Redundancy
      • $L_i$ = length of path from root to symbol (i) = number of bits representing that symbol.
      • $P_i$ = probability of occurrence of symbol (i) in the message.
      • n = size of alphabet.
      • <L> = average symbol length = $\sum_{1 \le i \le n} P_i L_i$ bits/symbol (bps)
      • For fixed-length coding, $L_i$ = L = constant, so <L> = L (bps)
      • Is this optimal (minimum)? Not necessarily.

  24. Coding Efficiency & Redundancy
      • The absolute minimum <L> in a message is called the Entropy.
      • The concept of entropy as a measure of the average information content of a message was introduced by Claude Shannon (1948).

  25. Coding Efficiency & Redundancy
      • Shannon's entropy represents an absolute limit on the best possible lossless compression of any communication. It is computed as: $H = -\sum_{1 \le i \le n} P_i \log_2 P_i$ bits/symbol (bps)

  26. Coding Efficiency & Redundancy
      • Coding Efficiency: $\eta = H / \langle L \rangle$, $0 \le \eta \le 1$
      • Coding Redundancy: $R = 1 - \eta$, $0 \le R \le 1$
      [Figure: actual <L>, optimal <L>, and perfect <L> (H)]

  27. Example: Fixed-Length Coding
      • 4-symbol alphabet (a, b, c, d). All symbols have the same length L = 2 bits.
      • Message: abbcaada
      • <L> = 2 (bps), H = 1.75

  28. Example
      • Entropy H = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 (bps)
      • Coding Efficiency $\eta$ = H / <L> = 1.75 / 2 = 0.875
      • Coding Redundancy R = 1 - 0.875 = 0.125
      • This is not optimal.
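The numbers in this example can be reproduced with a short Python sketch. It assumes the standard Shannon entropy $H = -\sum P_i \log_2 P_i$ and estimates the probabilities from symbol counts in the message; the function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def symbol_probabilities(message: str) -> Dict[str, float]:
    """Relative frequency of each symbol in the message."""
    counts = Counter(message)
    return {s: c / len(message) for s, c in counts.items()}


def entropy(probs: Iterable[float]) -> float:
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


probs = symbol_probabilities("abbcaada")   # a: 0.5, b: 0.25, c: 0.125, d: 0.125
H = entropy(probs.values())
avg_len = 2.0                              # fixed-length code: 2 bits per symbol
efficiency = H / avg_len
redundancy = 1 - efficiency
print(H, efficiency, redundancy)
```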

  29. Result
      Fixed-length coding is optimal (perfect) only when all symbol probabilities are equal.
      To prove this: with $n = 2^m$ symbols, L = m bits and <L> = m (bps). If all probabilities are equal,
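The equation that followed on this slide is missing from the transcript. One way the step can be completed, assuming the standard definition of Shannon's entropy, is sketched below; it shows that equal probabilities give a perfect fixed-length code.

```latex
P_i = \frac{1}{n} = 2^{-m}, \qquad
H = -\sum_{1 \le i \le n} P_i \log_2 P_i
  = -\sum_{1 \le i \le n} 2^{-m} \log_2 2^{-m}
  = n \cdot 2^{-m} \cdot m
  = m = \langle L \rangle ,
\qquad
\eta = \frac{H}{\langle L \rangle} = 1 .
```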

  30. Variable-Length Coding (Huffman Coding)
      The problem:
      • Given a set of symbols and their probabilities,
      • find a set of binary codewords that minimizes the average length of the symbols.

  31. Variable-Length Coding (Huffman Coding)
      Formally:
      • Input: A message M(A, P) with a symbol alphabet $A = \{a_1, a_2, \ldots, a_n\}$ of size (n) and a set of probabilities for the symbols $P = \{p_1, p_2, \ldots, p_n\}$
      • Output: A set of binary codewords $C = \{c_1, c_2, \ldots, c_n\}$ with bit lengths $L = \{L_1, L_2, \ldots, L_n\}$
      • Condition: $\langle L \rangle = \sum_{1 \le i \le n} p_i L_i$ is minimum.

  32. Variable-Length Coding (Huffman Coding)
      • To achieve optimality, we use optimal binary merge trees to code symbols of unequal probabilities.
      • Huffman Coding: More frequent symbols occur nearer to the root (shorter code lengths); less frequent symbols occur at deeper levels (longer code lengths).

  33. The Greedy Method
      • Store each symbol in a parentless node of a binary tree.
      • Insert symbols and their probabilities in a minimum heap of probabilities.
      • Repeat
          • Remove the two lowest probabilities ($p_i$, $p_j$) from the heap.
          • Merge the symbols with ($p_i$, $p_j$) to form a new symbol ($a_i a_j$) with probability $p_{ij} = p_i + p_j$.
          • Store symbol ($a_i a_j$) in a parentless node with two children $a_i$ and $a_j$.
          • Insert $p_{ij}$ and its symbols into the heap
        until all symbols are merged into one final alphabet (root).
      • Trace the path from the root to each leaf (symbol) to form the bit string for that symbol. Concatenate "0" for a left branch and "1" for a right branch.
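The steps above can be sketched in Python with the standard-library `heapq`. This is one possible rendering, not the author's implementation: trees are represented as nested `(left, right)` tuples, a counter breaks ties between equal probabilities, and the input is assumed to be a non-empty mapping from symbol to probability. Which child ends up on the left is a choice of this sketch, so the exact bit patterns may differ from the slides while the code lengths stay the same.

```python
import heapq
import itertools
from typing import Dict


def huffman_codes(probs: Dict[str, float]) -> Dict[str, str]:
    """Build Huffman codewords by repeatedly merging the two least probable nodes."""
    tie = itertools.count()   # tie-breaker so the heap never has to compare trees
    # Heap entry: (probability, tie, tree); a tree is a symbol or a (left, right) pair.
    heap = [(p, next(tie), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)   # lowest probability
        p2, _, t2 = heapq.heappop(heap)   # second lowest probability
        heapq.heappush(heap, (p1 + p2, next(tie), (t1, t2)))

    codes: Dict[str, str] = {}

    def walk(tree, prefix: str) -> None:
        """Trace root-to-leaf paths: '0' for the left branch, '1' for the right."""
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # a one-symbol alphabet still needs one bit

    walk(heap[0][2], "")
    return codes


codes = huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
lengths = {s: len(c) for s, c in codes.items()}
print(codes, lengths)
```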

  34. Example (1):
      • 4-symbol alphabet A = {a, b, c, d} of size (4).
      • Message M(A, P): abbcaada, P = {0.5, 0.25, 0.125, 0.125}
      • H = 1.75

  35. Building the Optimal Merge Table

  36. Optimal Merge Tree for Example (1)
      Example: a (50%), b (25%), c (12.5%), d (12.5%)
      [Figure: four parentless leaves a, b, c, d]

  37. Optimal Merge Tree for Example (1)
      Example: a (50%), b (25%), c (12.5%), d (12.5%)
      [Figure: c and d merged under a new node cd; a and b still parentless]

  38. Optimal Merge Tree for Example (1)
      Example: a (50%), b (25%), c (12.5%), d (12.5%)
      [Figure: b and cd merged under a new node bcd; a still parentless]

  39. Optimal Merge Tree for Example (1)
      Example: a (50%), b (25%), c (12.5%), d (12.5%)
      [Figure: a and bcd merged under the root abcd; the two branches of each node are labeled 0 and 1]

  40. Coding Efficiency for Example (1)
      • <L> = (1 * 0.5 + 2 * 0.25 + 3 * 0.125 + 3 * 0.125) = 1.75 (bps)
      • H = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 (bps)
      • $\eta$ = H / <L> = 1.75 / 1.75 = 1.00, R = 0.0
      Notice that symbols exist only at leaves, i.e., no symbol code is the prefix of another symbol code. This is why the method is also called "prefix coding".

  41. Analysis
      Inserting the n symbols into a minimum heap costs O(n log n).
      The repeat loop is done (n - 1) times. In each iteration, the worst-case removal of the two least elements costs 2 log n and the insertion of the merged element costs log n.
      Hence, the complexity of the Huffman algorithm is O(n log n).

  42. Example (2):
      • 4-symbol alphabet A = {a, b, c, d} of size (4).
      • P = {0.4, 0.25, 0.18, 0.17}
      • H = 1.909

  43. Example (2): Merge Table

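  44. Optimal Merge Tree for Example (2)
      [Figure: merge tree with root cdba whose children are cdb and a; cdb has children cd and b; cd has children c and d; the two branches of each node are labeled 0 and 1]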

  45. Coding Efficiency for Example (2)
      a (40%), b (25%), c (18%), d (17%)
      <L> = 1.95 bps (optimal), H = 1.909, $\eta$ = 97.9%, R = 2.1%
      Coding is optimal (97.9%) but not perfect.
      Important Result: Perfect coding ($\eta$ = 100%) can be achieved only for probability values of the form $2^{-m}$ (1/2, 1/4, 1/8, ... etc.)
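The figures for both examples can be checked with a small Python sketch that computes only the Huffman code lengths (each merge adds one bit to every symbol in the two merged groups) and then the efficiency. The function names `huffman_lengths` and `coding_efficiency` are illustrative, and the entropy formula is the standard Shannon definition.

```python
import heapq
import math
from typing import Dict, Tuple


def huffman_lengths(probs: Dict[str, float]) -> Dict[str, int]:
    """Code length of each symbol: the number of merges its group takes part in."""
    lengths = {s: 0 for s in probs}
    # Heap entry: (probability, unique id, symbols in this group).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        p1, _, g1 = heapq.heappop(heap)
        p2, _, g2 = heapq.heappop(heap)
        for s in g1 + g2:
            lengths[s] += 1                 # one level deeper in the merge tree
        heapq.heappush(heap, (p1 + p2, next_id, g1 + g2))
        next_id += 1
    return lengths


def coding_efficiency(probs: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (<L>, H, eta) for the Huffman code of the given probabilities."""
    lengths = huffman_lengths(probs)
    avg = sum(p * lengths[s] for s, p in probs.items())
    h = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return avg, h, h / avg


avg2, h2, eta2 = coding_efficiency({"a": 0.4, "b": 0.25, "c": 0.18, "d": 0.17})
avg1, h1, eta1 = coding_efficiency({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(round(avg2, 3), round(h2, 3), round(eta2, 3))
print(avg1, h1, eta1)
```

Example (1) has probabilities of the form $2^{-m}$ and reaches an efficiency of 1, while Example (2) does not.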

  46. File Compression
      • Variable-length codes can be used to compress files. Symbols are initially coded using ASCII (8-bit) fixed-length codes.
      • Steps:
          1. Determine the probabilities of symbols in the file.
          2. Build the merge tree (or table).
          3. Assign variable-length codes to symbols.
          4. Encode symbols using the new codes.
          5. Save the coded symbols in another file together with the symbol code table.
      • The Compression Ratio = <L> / 8
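The five steps can be sketched end to end in Python on an in-memory string. This is a simplified illustration under several assumptions: the "file" is a non-empty string, the output is kept as a string of '0'/'1' characters rather than packed bytes, and the size of the stored code table is ignored when computing the ratio. The function names are hypothetical.

```python
import heapq
from collections import Counter
from typing import Dict, Tuple


def build_codes(text: str) -> Dict[str, str]:
    """Steps 1-3: count symbols, merge the two rarest groups, grow their codes."""
    counts = Counter(text)
    if len(counts) == 1:                       # single-symbol file: use one bit
        return {next(iter(counts)): "0"}
    # Heap entry: (count, unique id, {symbol: partial codeword}).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, g1 = heapq.heappop(heap)
        c2, _, g2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in g1.items()}
        merged.update({s: "1" + code for s, code in g2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def compress(text: str) -> Tuple[str, Dict[str, str]]:
    """Steps 4-5: encode the text and return it together with the code table."""
    codes = build_codes(text)
    return "".join(codes[ch] for ch in text), codes


def decompress(bits: str, codes: Dict[str, str]) -> str:
    """Read bits until they match a codeword; prefix-freeness makes this unambiguous."""
    inverse = {code: s for s, code in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


text = "abbcaada"
bits, table = compress(text)
ratio = len(bits) / (8 * len(text))    # equals <L> / 8 for this message
print(bits, table, ratio)
```

For the message abbcaada this gives 14 bits instead of 64, i.e. a ratio of 1.75 / 8.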

  47. Huffman Coding Animations
      For examples of animations of Huffman coding, see:
      • http://www.cs.pitt.edu/~kirk/cs1501/animations Huffman.html
      • http://peter.bittner.it/tugraz/huffmancoding.html
