Greedy algorithm


  1. Greedy algorithm 叶德仕 yedeshi@zju.edu.cn

  2. Greedy algorithm's paradigm • An algorithm is greedy if • it builds up a solution in small steps, and • at each step it chooses a decision myopically, optimizing some underlying criterion • We analyze the optimality of a greedy algorithm by showing that: • at every step it is no worse than any other algorithm, or • every algorithm can be gradually transformed into the greedy one without hurting its quality

  3. Interval scheduling • Input: a set of intervals on the line, represented by pairs of points (the ends of the intervals). In other words, the i-th interval starts at time si and finishes at fi. • Output: a largest set of intervals such that no two of them overlap, i.e., the maximum number of pairwise non-overlapping intervals. • Greedy algorithm: • Select intervals one after another using some rule

  4. Rule 1 • Select the interval which starts earliest (and does not overlap the already chosen intervals) • Suboptimal! On a bad instance: OPT #4, Algorithm #1

  5. Rule 2 • Select the interval which is shortest (and does not overlap the already chosen intervals) • Suboptimal! On a bad instance: OPT #2, Algorithm #1

  6. Rule 3 • Select the interval with the fewest conflicts with the other remaining intervals (and not overlapping the already chosen intervals) • Suboptimal! On a bad instance: OPT #4, Algorithm #3

  7. Rule 4 • Select the interval which ends first (and does not overlap the already chosen intervals) • Quite a natural idea: we ensure that our resource becomes free as soon as possible, while still satisfying one request • Hurray! Exact solution!

  8. [figure: a run of Rule 4 on an example instance; the interval with the smallest finishing time f1 is chosen first, and the algorithm selects 3 intervals]

  9. Analysis - exact solution • The algorithm gives non-overlapping intervals: • obvious, since we always choose an interval which does not overlap the previously chosen intervals • The solution is exact: • Let A be the set of intervals obtained by the algorithm, • and OPT be a largest set of pairwise non-overlapping intervals. • We show that A must be as large as OPT

  10. Analysis - exact solution cont. • Let A = {A1, …, Ak} and OPT = {B1, …, Bm} be sorted by finishing time. By the definition of OPT we have k ≤ m • Fact: for every i ≤ k, Ai finishes not later than Bi. • Pf. by induction. • For i = 1 this holds by the definition of a step in the algorithm. • Suppose that Ai-1 finishes not later than Bi-1.

  11. Analysis cont. • From the definition of a step in the algorithm we get that Ai is the first interval that finishes after Ai-1 and does not overlap it. • If Bi finished before Ai, then Bi would overlap some of the previous A1, …, Ai-1, and • consequently, by the inductive assumption, it would overlap Bi-1, a contradiction. [figure: intervals Bi-1, Bi, Ai-1, Ai on the line]

  12. Analysis cont. • Theorem: A is the exact solution. • Proof: we show that k = m. • Suppose to the contrary that k < m. We know that Ak finishes not later than Bk. • Hence Bk+1 overlaps no interval of A, so the algorithm would have added Bk+1 to A and obtained a bigger solution, a contradiction. [figure: intervals Bk-1, Bk, Bk+1, Ak-1, Ak on the line]

  13. Time complexity • Sort the intervals according to their right-most ends • For every consecutive interval: • If its left-most end is after the right-most end of the last selected interval, then select this interval • Otherwise skip it and go to the next interval • Time complexity: O(n log n + n) = O(n log n) (see the sketch below)
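
A minimal Python sketch of this earliest-finish-time rule, assuming intervals are half-open so that one may start exactly when the previous one finishes:

def interval_scheduling(intervals):
    """Earliest-finish-time greedy (Rule 4).
    intervals: list of (start, finish) pairs.
    Returns a largest set of pairwise non-overlapping intervals."""
    # Sort by right-most end (finishing time): O(n log n).
    intervals = sorted(intervals, key=lambda iv: iv[1])
    selected = []
    last_finish = float("-inf")
    for start, finish in intervals:
        # Keep the interval only if it starts after the last selected one ends.
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected

print(interval_scheduling([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]))
# [(1, 4), (5, 7), (8, 11)]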

  14. Planning of schools • A collection of towns. We want to place schools in some of the towns. • Each school should be in a town • No one should have to travel more than 30 miles to reach one of them. • Model as a graph. Edge: two towns no farther than 30 miles apart

  15. Set cover • Input. A set of elements B and sets S1, …, Sm ⊆ B. • Output. A selection of the Si whose union is B. • Cost. Number of sets picked.

  16. Greedy • Greedy: repeatedly choose the set that covers the largest number of still-uncovered elements, until all of B is covered. • Example: place a school at town a, since this covers the largest number of other towns. On this instance: Greedy #4, OPT #3. (See the sketch below.)
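
A minimal Python sketch of this greedy rule, with a guard added (an assumption of this sketch) in case the input sets do not actually cover B:

def greedy_set_cover(B, sets):
    """Repeatedly pick the set covering the most uncovered elements.
    B: the universe (a set); sets: list of subsets of B.
    Returns the indices of the chosen sets."""
    uncovered = set(B)
    chosen = []
    while uncovered:
        # The greedy choice: largest intersection with what remains.
        i = max(range(len(sets)), key=lambda j: len(sets[j] & uncovered))
        if not sets[i] & uncovered:
            raise ValueError("the given sets do not cover B")
        chosen.append(i)
        uncovered -= sets[i]
    return chosen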

  17. Upper bound • Theorem. Suppose B contains n elements and that the optimal cover consists of k sets. Then the greedy algorithm will use at most k ln n sets. • Pf. Let nt be the number of elements still not covered after t iterations of the greedy algorithm (n0 = n). Since these remaining elements are covered by the optimal k sets, there must be some set with at least nt / k of them. Therefore, the greedy algorithm ensures that nt+1 ≤ nt - nt / k = nt (1 - 1/k)

  18. Upper bound cont. • Then nt+1 ≤ nt (1 - 1/k) ≤ nt e^(-1/k), since 1 - x ≤ e^(-x) for all x, with equality if and only if x = 0. • Thus nt ≤ n0 (1 - 1/k)^t < n0 e^(-t/k) = n e^(-t/k) • At t = k ln n, therefore, nt is strictly less than n e^(-ln n) = 1, which means no element remains to be covered. • Consequently, the approximation ratio is at most ln n

  19. Exercise • Knapsack problem

  20. Making Change • Goal. Given the currency denominations in HK: 1, 2, 5, 10, 20, 50, 100, 500, and 1000, devise a method to pay an amount to a customer using the fewest number of notes/coins. • Cashier's algorithm. At each iteration, add the note/coin of the largest value that does not take us past the amount to be paid. (See the sketch below.)
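
A minimal Python sketch of the cashier's algorithm. The greedy choice happens to be optimal for this particular denomination system, though not for arbitrary systems:

def make_change(amount, denominations=(1000, 500, 100, 50, 20, 10, 5, 2, 1)):
    """Pay `amount` using the cashier's greedy rule.
    Returns a dict {denomination: count}."""
    change = {}
    for d in denominations:  # largest value first
        count, amount = divmod(amount, d)
        if count:
            change[d] = count
    return change  # amount is now 0, since denomination 1 exists

print(make_change(2768))
# {1000: 2, 500: 1, 100: 2, 50: 1, 10: 1, 5: 1, 2: 1, 1: 1}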

  21. Optimal Offline Caching • Caching. • Cache with capacity to store k items. • Sequence of m item requests d1, d2, …, dm. • Cache hit: item already in cache when requested. • Cache miss: item not already in cache when requested: we must bring the requested item into the cache and, if the cache is full, evict some existing item. (A 'miss' also refers to the operation of bringing an item into the cache.) • Goal. Eviction schedule that minimizes the number of cache misses. • Ex: k = 2, initial cache = ab, requests: a, b, c, b, c, a, a, b. • Optimal eviction schedule: 2 cache misses, e.g.:

requests: a  b  c  b  c  a  a  b
cache:    ab ab cb cb cb ab ab ab

  22. Optimal Offline Caching: Farthest-In-Future • Farthest-in-future. Evict the cached item that is not requested until farthest in the future. • Theorem. [Belady, 1960s] FF is an optimal eviction schedule. • Pf. Algorithm and theorem are intuitive; the proof is subtle. • Ex: current cache: a, b, c, d, e, f; future queries: g a b c e d a b b a c d e a f a d e f g h … The request for g is a cache miss; FF ejects f, the cached item whose next request is farthest in the future. (See the sketch below.)
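
A minimal Python sketch that simulates the farthest-in-future policy and counts its misses:

def farthest_in_future(k, cache, requests):
    """Count the cache misses of Belady's farthest-in-future policy.
    k: capacity; cache: initially cached items; requests: the full
    (offline) request sequence."""
    cache = set(cache)
    misses = 0
    for t, item in enumerate(requests):
        if item in cache:
            continue  # cache hit
        misses += 1
        if len(cache) >= k:
            # Next request time of x (infinity if never requested again).
            def next_use(x):
                for i in range(t + 1, len(requests)):
                    if requests[i] == x:
                        return i
                return float("inf")
            # Evict the item requested farthest in the future.
            cache.remove(max(cache, key=next_use))
        cache.add(item)
    return misses

print(farthest_in_future(2, "ab", "abcbcaab"))  # 2, matching the example above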

  23. Minimum spanning tree • Input: a weighted graph G = (V, E) • every edge in E has a positive weight • Output: a spanning tree such that the sum of its edge weights is not bigger than the sum of weights of any other spanning tree • Spanning tree: a subgraph with • no cycle, and • connected (every two nodes in V are connected by a path) [figure: an example graph with edge weights 1, 2, 3 and one of its spanning trees]

  24. Properties of minimum spanning trees • Spanning trees: • n nodes • n - 1 edges • at least 2 leaves (leaf: a node with only one neighbor) • MST cycle property: • after adding a non-tree edge to an MST we obtain exactly one cycle, and all the MST edges in this cycle have weight no bigger than the weight of the added edge [figure: MST with an added edge closing a cycle]

  25. Optimal substructures • MST T (other edges of G are not shown).

  26. Optimal substructures • MST T (other edges of G are not shown). • Remove any edge (u, v) ∈ T.

  27. Optimal substructures • MST T (other edges of G are not shown). • Remove any edge (u, v) ∈ T. Then T is partitioned into two subtrees T1 and T2.

  28. Optimal substructures • Remove any edge (u, v) ∈ T. Then T is partitioned into two subtrees T1 and T2. • Theorem. The subtree T1 is an MST of G1 = (V1, E1), the subgraph of G induced by the vertices of T1: • V1 = vertices of T1, • E1 = { (x, y) ∈ E : x, y ∈ V1 }. • Similarly for T2.

  29. Proof of optimal substructure • Proof. Cut and paste: • w(T) = w(u, v) + w(T1) + w(T2). • If T1′ were a lower-weight spanning tree than T1 for G1, then T′ = {(u, v)} ∪ T1′ ∪ T2 would be a lower-weight spanning tree than T for G.

  30. Do we also have overlapping subproblems? • Yes. • Great, then dynamic programming may work! • Yes, but MST exhibits another powerful property which leads to an even more efficient algorithm.

  31. Crucial observation about MST • Consider sets of nodes A and V - A • Let F be the set of edges between A and V - A • Let a be the smallest weight of an edge in F • Theorem: Every MST must contain at least one edge of weight a from the set F [figure: a cut between A and V - A in the example graph]

  32. Proof of the observation • Let e be the edge in F with the smallest weight; for simplicity assume that there is a unique such edge. • Suppose to the contrary that e is not in some MST, and choose one such MST. • Add e to this MST: we obtain exactly one cycle, in which e has the smallest weight. • Since the two ends of e are in different sets A and V - A, the cycle contains another edge f that is also in F. • Remove f from the tree (with the added edge e): we obtain a spanning tree with smaller weight (since f has bigger weight than e). This contradicts the minimality of the MST. [figure: the cycle crosses the cut at both e and f]

  33. Greedy algorithm finding MST • Kruskal's algorithm: • Sort all edges according to their weights in non-decreasing order • Choose n - 1 edges one after another as follows: • if a newly added edge does not create a cycle with the previously selected ones, keep it in the (partial) MST; otherwise discard it • Remark: we always have a partial forest (see the sketch below) [figure: example run of Kruskal's algorithm]
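
A minimal Python sketch of Kruskal's algorithm, using a simple union-find structure for the cycle test:

def kruskal(n, edges):
    """Kruskal's MST: scan edges by non-decreasing weight, keeping an
    edge iff it joins two different components (i.e., creates no cycle).
    n: number of nodes 0..n-1; edges: list of (weight, u, v) triples.
    Returns the list of MST edges."""
    parent = list(range(n))  # union-find forest

    def find(x):
        # Find the root of x's component, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):  # non-decreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:  # u and v lie in different trees: no cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (2, 2, 3), (3, 1, 3)]))
# [(1, 0, 1), (2, 1, 2), (2, 2, 3)]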

  34. Greedy algorithm finding MST • Prim's algorithm: • Select a node arbitrarily as the root • Choose n - 1 edges one after another as follows: • look at all edges incident to the currently built (partial) tree which do not create a cycle in it, and select one with the smallest weight • Remark: we always have a connected partial tree (see the sketch below) [figure: example run of Prim's algorithm from the root]
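
A minimal Python sketch of Prim's algorithm, using a priority queue (heap) to hold the edges leaving the current tree:

import heapq

def prim(n, adj, root=0):
    """Prim's MST: grow a single tree from the root, always adding the
    smallest-weight edge leaving the tree.
    n: number of nodes; adj[u] is a list of (weight, v) pairs.
    Returns the list of MST edges."""
    in_tree = [False] * n
    in_tree[root] = True
    heap = [(w, root, v) for w, v in adj[root]]
    heapq.heapify(heap)
    mst = []
    while heap and len(mst) < n - 1:
        w, u, v = heapq.heappop(heap)
        if in_tree[v]:
            continue  # both ends already in the tree: would create a cycle
        in_tree[v] = True
        mst.append((w, u, v))
        for w2, x in adj[v]:
            if not in_tree[x]:
                heapq.heappush(heap, (w2, v, x))
    return mst

# Same 4-node graph as in the Kruskal example, as an adjacency list:
adj = [[(1, 1), (3, 2)], [(1, 0), (2, 2), (3, 3)],
       [(2, 1), (3, 0), (2, 3)], [(2, 2), (3, 1)]]
print(prim(4, adj))  # [(1, 0, 1), (2, 1, 2), (2, 2, 3)]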

  35.-46. Example of Prim [figures: a step-by-step run of Prim's algorithm on an example graph with edge weights 3, 5, 6, 7, 8, 9, 10, 12, 14, 15; at each step the tree A grows by the smallest-weight edge crossing the cut (A, V - A)]

  47. Why do the algorithms work? • It follows from the crucial observation. • Kruskal's algorithm: • Suppose we add edge {v, w}. This edge has the smallest weight among the edges between the set of nodes already connected with v (by a path in the selected subgraph) and the other nodes. • Prim's algorithm: • always chooses an edge with the smallest weight among the edges between the set of already connected nodes and the free nodes.

  48. Time complexity There are implementations using • Union-find data structure (Kruskal’s algorithm) • Priority queue (Prim’s algorithm) achieving time complexity O(m log n) where n is the number of nodes and m is the number of edges

  49. Best of MST • Best to date: • Karger, Klein, and Tarjan [1993]. • Randomized algorithm. • O(V + E) expected time.

  50. Conclusions • Greedy algorithms for finding a minimum spanning tree in a graph, both in time O(m log n): • Kruskal's algorithm • Prim's algorithm • It remains to design the efficient data structures!
