
Sum Selection in Arrays

Sum Selection in Arrays. Allan Grønlund Jørgensen, Qualification Exam. Priority Queues Resilient to Memory Faults, with Moruz, Mølhave (WADS 07). Optimal Resilient Dictionaries, with Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Moruz, Mølhave (ESA07).





Presentation Transcript


  1. Sum Selection in Arrays • Allan Grønlund Jørgensen • Qualification Exam

  2. Progress Report • Fault Tolerance: Priority Queues Resilient to Memory Faults, with Moruz, Mølhave (WADS 07); Optimal Resilient Dictionaries, with Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Moruz, Mølhave (ESA07); Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency, with Brodal and Mølhave (Manuscript-ICALP08) • Sum Selection: A Linear Time Algorithm for the k Maximal Sums Problem, with Brodal (MFCS 07); Sum Selection, with Brodal (Manuscript-ICALP08)

  3. (Figure: the array 42 -8 7 2 -52 and its 15 sums: 42 7 -52 -8 2 34 -1 9 -50 41 1 -43 43 -51 -9.)

  4. Outline • Introduction • The k maximal sums problem • Length constrained k maximal sums problem • Sum selection problem • Summary and plans for the future

  5. The Maximum Sum Problem • Given an array of numbers, find the largest sum of a contiguous subarray • Example: in -3 7 -12 1 6 -3 5 -2 the largest sum is A[4..7] = 1 + 6 - 3 + 5 = 9, written (4,7,9)

  6. Kadane’s Algorithm (’77) • Scan the array from the left and in step i update: • the largest suffix sum (largest sum ending at A[i]) • the largest sum so far (largest sum in A[1,…,i]) (Figure: the scan on the array 1 7 -12 1 6 -3 5 -2.)
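
The two quantities maintained by the scan can be sketched as follows. This is a minimal illustration, not code from the talk; the function name `kadane` and the 1-indexed `(start, end, sum)` return format (chosen to match the (4,7,9) notation of slide 5) are assumptions.

```python
def kadane(a):
    """Return (start, end, total) of a maximum-sum non-empty subarray, 1-indexed.

    Returns None for an empty array.
    """
    best = None                      # largest sum seen so far, as (start, end, total)
    suffix_sum, suffix_start = 0, 1  # largest sum ending at the current position
    for i, x in enumerate(a, start=1):
        if suffix_sum <= 0:
            # A non-positive suffix cannot help: restart the suffix at i.
            suffix_sum, suffix_start = x, i
        else:
            suffix_sum += x
        if best is None or suffix_sum > best[2]:
            best = (suffix_start, i, suffix_sum)
    return best


print(kadane([-3, 7, -12, 1, 6, -3, 5, -2]))  # (4, 7, 9)
```

The scan uses O(n) time and O(1) additional space, since only the current suffix and the best sum so far are stored.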

  7. Outline • Introduction • The k maximal sums problem • Length constrained k maximal sums problem • Sum selection problem • Summary and plans for the future

  8. The k Maximal Sums Problem • Given an array of numbers, find the k largest sums (they may overlap) • Example with k=2 on -3 7 -12 1 6 -3 5 -2: the two largest sums are 9 and 8
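
As a reference point for the problem statement, a brute-force baseline can be sketched as below. It enumerates all n(n+1)/2 sums, so it runs in quadratic time and is only useful for checking small examples; it is not the O(n+k) algorithm of the talk. The function name is hypothetical.

```python
import heapq


def k_maximal_sums_bruteforce(a, k):
    """Return the k largest subarray sums as (total, start, end), 1-indexed."""
    sums = []
    for i in range(len(a)):
        total = 0
        for j in range(i, len(a)):
            total += a[j]                     # sum of a[i..j]
            sums.append((total, i + 1, j + 1))
    # nlargest returns the results in decreasing order of total.
    return heapq.nlargest(k, sums)


print(k_maximal_sums_bruteforce([-3, 7, -12, 1, 6, -3, 5, -2], 2))
# [(9, 4, 7), (8, 5, 7)]
```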

  9. Goal Optimal O(n+k) time algorithm outputting the k maximal sums

  10. Main Idea (Intuition) • Build all sums and insert them into a heap-ordered binary tree • Find the k largest sums using Frederickson’s heap selection algorithm (’93) in O(k) time

  11. Example (k=4) • Array: -12 1 6 -3 5 • Frederickson’s algorithm finds the k largest nodes (red in the figure) in O(k) time, in no particular order (Figure: heap-ordered binary tree holding the 15 sums 9 6 8 4 3 -3 7 -3 -8 -5 5 -11 -12 1 2.)
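
Frederickson's O(k) selection algorithm is intricate. The sketch below swaps in a simpler best-first search with an auxiliary priority queue, which finds the k largest keys of a heap-ordered binary tree in O(k log k) time. It is shown only to illustrate why heap order lets the search touch O(k) nodes; the array layout (children of index i at 2i+1 and 2i+2) and the function name are assumptions made for the example.

```python
import heapq


def k_largest_in_heap(heap, k):
    """Return the k largest keys of a max-heap stored in array layout.

    Only nodes whose parent has already been output are ever inspected,
    so at most 2k+1 nodes are touched regardless of the heap size.
    """
    if not heap or k <= 0:
        return []
    frontier = [(-heap[0], 0)]        # min-heap on negated keys = max-first
    result = []
    while frontier and len(result) < k:
        neg_key, i = heapq.heappop(frontier)
        result.append(-neg_key)
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap):
                heapq.heappush(frontier, (-heap[child], child))
    return result


a = [-12, 1, 6, -3, 5]
sums = [sum(a[i:j]) for i in range(len(a)) for j in range(i + 1, len(a) + 1)]
max_heap = sorted(sums, reverse=True)  # a decreasing array is a valid max-heap
print(k_largest_in_heap(max_heap, 4))  # [9, 8, 7, 6]
```

Unlike Frederickson's algorithm, this variant returns the keys in sorted order, which is where the extra log k factor is spent.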

  12. The Iheap • It is a heap ordered binary tree • Supports insertions in amortized constant time

  13. Inserting 7 in an Iheap (Figure: an Iheap with keys 9, 5, 4, 3 and subtrees T1 to T4, into which 7 is inserted.)
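
One way to obtain amortized constant-time insertion into a heap-ordered binary tree is sketched below. It assumes, as the figure suggests but the transcript does not confirm, that the keys on the rightmost path are kept in decreasing order and that a new key is spliced into that path. The class and attribute names are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class IHeap:
    """Heap-ordered binary tree whose rightmost path is kept in decreasing order."""

    def __init__(self):
        self.root = None
        self._spine = []  # nodes on the rightmost path, top to bottom

    def insert(self, key):
        node = Node(key)
        last_popped = None
        # Walk up the rightmost path past every smaller key.
        while self._spine and self._spine[-1].key < key:
            last_popped = self._spine.pop()
        # The smaller keys (with their subtrees) hang below the new node.
        node.left = last_popped
        if self._spine:
            self._spine[-1].right = node
        else:
            self.root = node
        self._spine.append(node)


h = IHeap()
for x in (9, 5, 4, 3, 7):
    h.insert(x)
print(h.root.key, h.root.right.key, h.root.right.left.key)  # 9 7 5
```

Each node is pushed onto the path once and popped at most once, so a sequence of n insertions costs O(n) in total.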

  14. Main Issue • There are n(n+1)/2 = Θ(n²) sums • Constructing and inserting Θ(n²) sums into a heap-ordered binary tree takes Θ(n²) time

  15. Grouping Sums • The sums are grouped by their endpoint in the array • Example on -3 7 -12 1 6 -3 5 -2: Q4 = { (1,4,-7), (2,4,-4), (3,4,-11), (4,4,1) }

  16. Constructing Q5 from Q4 • Array: -3 7 -12 1 6 -3 5 -2 • Q4: (1,4,-7) (2,4,-4) (3,4,-11) (4,4,1) • Q5: (1,5,-1) (2,5,2) (3,5,-5) (4,5,7) (5,5,6) • Every sum in Q4 grows by A[5] = 6, and the new one-element sum (5,5,6) is added

  17. Main Idea Continued • Represent each Q set as a heap ordered binary tree H • Combine all heaps by assembling them into one big heap using dummy infinity keys

  18. The Assembled Heap (Figure: the heaps H1 to H5 joined into one heap.)

  19. Representing Q Sets • Each set Q_j is represented by a tuple < d_j, H_j > • H_j is an Iheap containing all j sums from Q_j • d_j is a number that must be added to all elements of H_j • We get the following construction equations: < d_0, H_0 > = < 0, { } > and < d_{j+1}, H_{j+1} > = < d_j + A[j+1], H_j ∪ {-d_j} >

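  20. Example on -3 7 -12, using < d_0, H_0 > = < 0, { } > and < d_{j+1}, H_{j+1} > = < d_j + A[j+1], H_j ∪ {-d_j} > • j=1: H_1 = {0}, d_1 = -3, sums {-3} • j=2: H_2 = {0, 3}, d_2 = 4, sums {4, 7} • j=3: H_3 = {0, 3, -4}, d_3 = -8, sums {-8, -5, -12}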
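
The construction equations can be traced with a few lines of code. In this sketch a plain Python list stands in for the Iheap, and copying the list stands in for partial persistence, so it costs O(j) per step instead of amortized O(1). It only illustrates the offset trick; the function names are hypothetical.

```python
def build_pairs(a):
    """Return the pairs <d_j, H_j> for j = 1..n, with H_j as a plain list."""
    d, h = 0, []                  # <d_0, H_0> = <0, {}>
    pairs = []
    for x in a:
        h = h + [-d]              # H_{j+1} = H_j ∪ {-d_j}; the copy keeps old versions
        d = d + x                 # d_{j+1} = d_j + A[j+1]
        pairs.append((d, h))
    return pairs


def sums_of(pair):
    """Recover the actual sums of Q_j by adding the offset d_j to every element."""
    d, h = pair
    return [d + e for e in h]


for pair in build_pairs([-3, 7, -12]):
    print(pair, sums_of(pair))
# (-3, [0]) [-3]
# (4, [0, 3]) [4, 7]
# (-8, [0, 3, -4]) [-8, -5, -12]
```

Storing offsets instead of sums is what allows one insertion per step: the existing elements never need to be updated when A[j+1] is appended.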

  21. Analysis of Pair Construction • Building each pair takes amortized constant time (one insertion into the Iheap) • But the old version disappears • Solution: partial persistence (Driscoll et al. ’89) (Figure: version i and version i+1 of an Iheap, before and after inserting 7.)

  22. Recap • Build all pairs in O(n) time • Join them into a single heap in O(n) time • Use Frederickson’s algorithm to get the k+n-1 largest and discard the dummies in O(n+k) time • Result: an O(n+k) time algorithm (Figure: the assembled heap of H1 to H5.)

  23. Space Reduction • The current algorithm uses O(n+k) time and O(n+k) additional space • The input array is considered read-only • Kadane’s algorithm uses O(1) additional space • Goal: reduce the additional space usage to O(k)

  24. Higher Dimensions • For an m x n matrix, we get … • In general we get … • Can be reduced to the 1D case

  25. Outline • Introduction • The k maximal sums problem • Length constrained k maximal sums problem • Sum selection problem • Summary and plans for the future

  26. Length Constrained k Maximal Sums Problem • Each sum must be an aggregate of at least l numbers and at most u numbers • Example with l=3 and u=5 on 12 7 -666 8 7 -6 4 -2: the best sum overall is 19 (12 + 7, too short), the best valid sum is 13 (8 + 7 - 6 + 4)
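
For the special case k=1, the length constraint can be handled in linear time with prefix sums and a sliding-window minimum. The sketch below uses that standard technique, not the heap-based construction of the talk, and its function name and return format are assumptions.

```python
from collections import deque


def max_sum_length_constrained(a, l, u):
    """Return (start, end, total), 1-indexed, of the largest sum of between
    l and u consecutive elements, or None if no such subarray exists."""
    n = len(a)
    prefix = [0] * (n + 1)
    for i, x in enumerate(a):
        prefix[i + 1] = prefix[i] + x

    best = None
    window = deque()  # candidate start offsets, prefix values increasing
    for j in range(l, n + 1):
        i = j - l                                  # newest admissible offset
        while window and prefix[window[-1]] >= prefix[i]:
            window.pop()
        window.append(i)
        while window[0] < j - u:                   # drop offsets that are too far back
            window.popleft()
        total = prefix[j] - prefix[window[0]]
        if best is None or total > best[2]:
            best = (window[0] + 1, j, total)
    return best


print(max_sum_length_constrained([12, 7, -666, 8, 7, -6, 4, -2], 3, 5))  # (4, 7, 13)
```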

  27. Goal Optimal O(n+k) time algorithm outputting the k maximal sums with length between l and u

  28. First Approach • Use the same idea as before but redefine Q to match the length criteria • The construction equation is almost identical but requires a deletion (Figure: the assembled heap of H1 to H5.)

  29. Constructing Q Sets Using Deletions (l=3, u=6) • Array: -5 17 42 -10 0 12 -10 666 • Q6: (1,6,56) (2,6,61) (3,6,44) (4,6,2) • Q7: (2,7,51) (3,7,34) (4,7,-8) (5,7,2) • The sum (1,7,46) has length 7 > u and is deleted
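
The redefined Q sets can be enumerated directly from prefix sums, which makes the effect of the deletion visible. This is a definitional sketch with a hypothetical function name; it does not use heaps and does not reproduce the incremental construction.

```python
def q_set(a, j, l, u):
    """Return all sums (start, j, total), 1-indexed, that end at position j
    and consist of between l and u elements."""
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)
    first = max(1, j - u + 1)   # longest admissible sum starts here
    last = j - l + 1            # shortest admissible sum starts here
    return [(i, j, prefix[j] - prefix[i - 1]) for i in range(first, last + 1)]


a = [-5, 17, 42, -10, 0, 12, -10, 666]
print(q_set(a, 6, 3, 6))  # [(1, 6, 56), (2, 6, 61), (3, 6, 44), (4, 6, 2)]
print(q_set(a, 7, 3, 6))  # [(2, 7, 51), (3, 7, 34), (4, 7, -8), (5, 7, 2)]
```

Moving from end index 6 to 7 adds the start index 5 and drops the start index 1, which is the deletion the slide refers to.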

  30. Result • Same algorithm as before using the new way of constructing the next heap • Deleting an element in a heap of size n with constant time insertion takes O(log n) • O(n log(u-l) + k) time algorithm (Figure: the assembled heap of H1 to H5.)

  31. A Better Way of Constructing the Q Sets (u=8, l=4) • Divide into slabs of size u-l+1 • For each slab build two sets of heaps: one from the left (L) and one from the right (R) • For each index j group all sums of length between l and u ending at j+l-1 using the sets from above and two constants (Figure: worked example for j=3 in slab 2, with the array split into Slab 1 and Slab 2.)

  32. Result • Same algorithm using the new way to group sums • Building the L and R sets takes O(u-l) time for each slab • O(n+k) time algorithm (Figure: the assembled heap of H1 to H5.)

  33. Outline • Introduction • The k maximal sums problem • Length constrained k maximal sums problem • Sum selection problem • Summary and plans for the future

  34. Sum Selection • Given an array of numbers, find the k’th largest sum • Example with k=5 on 42 -8 7 2 -52 • The 15 sums in sorted order: -52 -51 -50 -43 -9 -8 -1 1 2 7 9 34 41 42 43 • The 5th largest sum is 9
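
The example can be checked with a brute-force selection that builds and sorts all sums. This takes O(n² log n) time and serves only as a reference for small inputs; the function names are hypothetical.

```python
def all_sums(a):
    """Return the n(n+1)/2 sums of contiguous subarrays of a."""
    sums = []
    for i in range(len(a)):
        total = 0
        for j in range(i, len(a)):
            total += a[j]
            sums.append(total)
    return sums


def kth_largest_sum(a, k):
    """Return the k'th largest subarray sum (k is 1-indexed)."""
    return sorted(all_sums(a), reverse=True)[k - 1]


a = [42, -8, 7, 2, -52]
print(sorted(all_sums(a)))
# [-52, -51, -50, -43, -9, -8, -1, 1, 2, 7, 9, 34, 41, 42, 43]
print(kth_largest_sum(a, 5))  # 9
```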

  35. First Solution • Use the algorithm finding the k maximal sums to find the k largest and output the smallest of these • The algorithm uses O(n+k) time • What if k is large?

  36. Lower Bound • Reduction from the Cartesian Sum Problem (X+Y) • A lower bound of Ω(|Y| + |Y|log(k/|Y|)) (Frederickson and Johnson ’82) (Figure: example sets X and Y.)

  37. Reduction (Figure: worked example of the reduction on the sets X and Y.)

  38. Result • An Ω(n+nlog(k/n)) lower bound for the sum selection problem

  39. Goal Optimal O(n+nlog(k/n)) time algorithm for selecting the k’th largest sum

  40. Algorithm • Reduction to selection in sorted arrays and weight balanced search trees • Frederickson and Johnson(’82) already solved selection in n arrays in optimal O(n + nlog(k/n)) time • Adapt this algorithm such that it also works on weight balanced trees
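
To make the target problem concrete, selection across several sorted arrays can be sketched with a lazy k-way merge. This baseline costs O(n + k log n) time for n arrays, which is worse than the O(n + nlog(k/n)) bound of Frederickson and Johnson when k is large; it is not their algorithm. The function name and the example arrays are hypothetical.

```python
from heapq import merge
from itertools import islice


def kth_largest_in_sorted_arrays(arrays, k):
    """Return the k'th largest element (1-indexed) among arrays that are
    each sorted in decreasing order."""
    merged = merge(*arrays, reverse=True)   # lazy merge, largest first
    return next(islice(merged, k - 1, None))


arrays = [[9, 4, 1], [8, 7, 2], [6, 5, 3]]
print(kth_largest_in_sorted_arrays(arrays, 4))  # 6
```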

  41. Block Heap • Heap-ordered binary tree • Each node stores B sorted elements • Inserting a block of B elements takes O(B) time (Figure: example with blocks 54,49,42 / 39,31,25 / 23,22,21 / 24,12,7 / 17,13,11 / 10,5,1 / 9,6,3.)

  42. Reducing Sum Selection to Selection in Arrays and Trees • Divide into slabs of size k/n • Each index j is associated with two data structures that together cover all sums ending at index j • The first data structure holds all sums starting in the current slab and is named WBj • The second holds the rest and is named BHj • Extending to a new slab: a block of k/n elements is inserted into BH (Figure: example with two panels, "Extending within a slab" and "Extending to new slab".)

  43. Reducing Problem • One insert in a tree per step and one insert in a Block heap every k/n steps • n trees of size at most k/n and n Block heaps • Join all Block heaps together and use Frederickson’s algorithm to find the 4n blocks with largest minimum • n trees and O(n) sorted arrays left (Figure: the assembled heap of H1 to H5.)

  44. Result • Selection in O(n) trees and sorted arrays storing O(k) elements can be done in O(n+nlog(k/n)) time • Result is an O(n+nlog(k/n)) time algorithm.

  45. Outline • Introduction • The k maximal sums problem • Length constrained k maximal sums problem • Sum selection problem • Summary and plans for the future

  46. Summary of Results Sum Selection:

  47. Summary of Results Fault Tolerant Data Structures:

  48. Progress and Future (Figure: timeline from PhD Start through MIT to Qualification Exam. Fault Tolerance: Priority Queue, Searching, Dictionary, I/O Eff. Search, I/O Eff. Sorting, Cache Oblivious. Sums in Arrays: k Max Sums, (l,u) k Max Sums, Sum Selection, Selection in arb. Trees.)
