
Computing the cube


Presentation Transcript


  1. Computing the cube Abhinandan Das CS 632 Mar 8 2001

  2. On the Computation of Multidimensional Aggregates • Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey Naughton, Raghu Ramakrishnan & Sunita Sarawagi -- VLDB 1996

  3. Motivation • OLAP / Multidimensional data analysis • Eg: Transactions(Prod,Date,StoreId,Cust,Sales) • Sum of sales by: (P,SId) ; (P) ; (P,D,SId) • Computing multidimensional aggregates is a performance bottleneck • Efficient computation of several related group-bys

  4. What is a CUBE? • n-dimensional generalization of the group-by operator • Group-bys corresponding to all possible subsets of a given set of attributes • Eg: SELECT P, D, C, Sum(Sales) FROM Transactions CUBE-BY P, D, C • Produces all 2³ = 8 group-bys: ALL, P, D, C, PD, PC, DC, PDC
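
For concreteness, a minimal Python sketch (not from the paper) of the naïve approach: one hash aggregation per subset of the CUBE-BY attributes, so the base table is scanned once per group-by. Attribute names and rows are illustrative.

```python
from itertools import combinations
from collections import defaultdict

def naive_cube(rows, dims, measure):
    """rows: list of dicts; dims: CUBE-BY attributes; measure: name of the
    value to sum. Returns {attribute-subset: {group key: aggregate}}."""
    cube = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):   # () is the ALL group-by
            agg = defaultdict(float)
            for r in rows:                     # one full scan per group-by
                agg[tuple(r[a] for a in subset)] += r[measure]
            cube[subset] = dict(agg)
    return cube

rows = [{"P": "jacket", "D": "1996", "C": "kmart", "Sales": 40},
        {"P": "jacket", "D": "1997", "C": "sears", "Sales": 30}]
print(naive_cube(rows, ["P", "D", "C"], "Sales")[("P",)])
# {('jacket',): 70.0}
```

The optimizations on the following slides exist precisely to avoid this one-scan-per-group-by cost.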

  5. Approaches • Basic group-by methods: • Sort based • Hash based • Naïve approach

  6. Possible optimizations • Smallest parent • Cache results • Amortize scans • Share sorts • Share partitions • Often contradictory • Assumption: Distributive aggregate function • Distributive: sum, count, min, max (average is handled via sum & count); non-distributive: median
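
A quick illustration of why distributivity matters: a distributive function can be assembled from sub-aggregates, which is what lets a group-by be computed from a parent group-by instead of from the raw data. Median cannot be assembled this way. A minimal Python check:

```python
import statistics

part1, part2 = [1, 2, 9], [3, 4]

# sum is distributive: partial sums combine into the total
assert sum(part1) + sum(part2) == sum(part1 + part2)

# median is not: the median of medians is generally wrong
med_of_meds = statistics.median([statistics.median(part1),
                                 statistics.median(part2)])
true_med = statistics.median(part1 + part2)
print(med_of_meds, true_med)   # 2.75 vs 3 -- they disagree
```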

  7. Sort based methods • Algorithm PipeSort • Share-sorts vs smallest parent • Optimize to get minimum total cost • Cache-results & amortize-scans • Pipelining: ABCD → ABC → AB → A
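
A sketch of what the pipeline buys, assuming tuples already sorted on (A, B, C, D) and a sum aggregate: one scan feeds every prefix group-by, and each group-by holds only one running aggregate, flushed when its prefix changes.

```python
def pipeline(sorted_rows):
    """sorted_rows: (a, b, c, d, value) tuples sorted on (a, b, c, d).
    Computes ABC, AB and A in one scan with one open group per prefix."""
    prefixes = (3, 2, 1)                       # ABC, AB, A
    current = {k: None for k in prefixes}      # (key, running_sum) per prefix
    results = {k: [] for k in prefixes}
    for *key, v in sorted_rows:
        for k in prefixes:
            pk = tuple(key[:k])
            if current[k] and current[k][0] != pk:
                results[k].append(current[k])  # prefix changed: emit group
                current[k] = None
            current[k] = (pk, (current[k][1] if current[k] else 0) + v)
    for k in prefixes:                         # flush the last open groups
        if current[k]:
            results[k].append(current[k])
    return results

rows = [("a", 1, "x", "p", 5), ("a", 1, "x", "q", 2), ("a", 2, "y", "p", 1)]
print(pipeline(rows)[1])                       # [(('a',), 8)]
```

(In the actual pipeline each group-by's output feeds the next one down, ABCD → ABC → AB → A, rather than every prefix reading the raw tuple; for a distributive function the two are equivalent.)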

  8. PipeSort • Assumption: Have an estimate of the number of distinct values for each group-by • Input: Search lattice • Graph where each vertex represents a group-by of the cube • Edge i → j if |i| = |j| + 1 and j's attributes are a subset of i's

  9. Search lattice: CUBE-BY ABCD

  10. Search lattice (contd...) • Each edge eij associated with two costs: • S(eij): Cost of computing j from i when i is pre-sorted • U(eij): Cost of computing j from i when i is not already sorted • Idea: If attribute order of a group-by j is a prefix of parent i, compute j without sorting (Cost S(eij)) else first sort (Cost U(eij))
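
The prefix rule as a tiny Python helper; the S and U values are assumed, precomputed cost estimates as on the slide:

```python
def edge_cost(parent_order, child_order, S, U):
    """S: cost of computing the child when the parent's sort order can be
    reused; U: cost when the parent must first be re-sorted."""
    is_prefix = parent_order[:len(child_order)] == child_order
    return S if is_prefix else U

print(edge_cost(("A", "B", "C"), ("A", "B"), S=10, U=25))  # 10: AB is a prefix
print(edge_cost(("A", "B", "C"), ("A", "C"), S=10, U=25))  # 25: AC needs a sort
```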

  11. PipeSort (contd...) • Proceed level by level, k=0 to k=N-1 • Find best way of computing level k from level k+1 • Weighted bipartite matching: • Make k additional replicas of each node at level k+1 • Cost S(eij) on original node, U(eij) on replicas • Find minimum cost maximal matching

  12. Min cost matching

  13. Algorithm PipeSort • For level k = 0 to N−1 • Generate_plan(k+1 → k) • Fix sort order of level k+1 nodes • Generate_plan(k+1 → k): • Make k additional replicas of level k+1 nodes, assign appropriate edge costs • Find min-cost matching
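
A sketch of Generate_plan using an off-the-shelf assignment solver in place of the paper's weighted bipartite matching routine. S_cost and U_cost are assumed precomputed dictionaries keyed by (parent, child), and BIG forbids pairing a child with a non-parent.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def generate_plan(parents, children, S_cost, U_cost, k):
    """One level of PipeSort: assign each level-k group-by (child) to one
    copy of a level-(k+1) group-by (parent)."""
    BIG = 10**9                                   # forbids invalid pairings
    cols = [(p, r) for p in parents for r in range(k + 1)]  # k extra replicas
    cost = np.full((len(children), len(cols)), BIG, dtype=float)
    for i, c in enumerate(children):
        for j, (p, r) in enumerate(cols):
            if set(c) <= set(p):                  # child must be subset of parent
                cost[i, j] = S_cost[p, c] if r == 0 else U_cost[p, c]
    rows, picked = linear_sum_assignment(cost)
    # r == 0: child computed from the parent's existing sort order (this
    # also fixes that parent's order); r >= 1: parent is re-sorted first
    return [(children[i], cols[j]) for i, j in zip(rows, picked)]
```

Because the solver assigns every child exactly once and each parent copy at most once, the result is a maximal matching of minimum total cost, as the slide requires.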

  14. Example plan

  15. Tweaks • Aggregate and remove duplicates whilst sorting • Use partial sorting order to reduce sort costs • Eg: ABC → AC

  16. Hash based methods • Algorithm PipeHash • Can include all stated optimizations (if enough memory is available) • For k = N−1 downto 0 • For each group-by g at level k+1 • Compute in 1 scan of g all children for which g is smallest parent • Save g and deallocate hash table of g
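
A minimal sketch of the inner step, assuming the parent group-by g is held as a hash table from key tuples to partial sums; distributivity is what lets g's aggregates be reused in place of the raw data.

```python
from collections import defaultdict

def compute_children(g_table, g_attrs, child_attr_sets):
    """One scan of parent g produces every child for which g was chosen
    as the smallest parent. g_table: {key_tuple: aggregate}."""
    children = {c: defaultdict(float) for c in child_attr_sets}
    pos = {a: i for i, a in enumerate(g_attrs)}
    for key, val in g_table.items():              # single scan of g
        for c in child_attr_sets:
            ck = tuple(key[pos[a]] for a in c)
            children[c][ck] += val                # reuse g's partial sums
    return children

g = {("jacket", "1996", "kmart"): 40.0, ("jacket", "1997", "kmart"): 30.0}
print(compute_children(g, ("P", "D", "C"), [("P",), ("P", "C")])[("P", "C")])
# {('jacket', 'kmart'): 70.0}
```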

  17. PipeHash • Limited memory → use partitioning • Optimization share-partitions: • When data is partitioned on attribute A, all group-bys containing A can be computed independently on each partition • No recombination required

  18. PipeHash: Overview • First choose smallest parent for each group-by (gives a minimum spanning tree, MST) • Optimize for memory constraints: • Decide what group-bys to compute together • Choice of attribute for data partitioning • Minimizing overall disk scan cost: NP-hard!

  19. PipeHash

  20. Heuristics • Optimizations cache-results and amortize-scans favoured by choosing large subtrees of MST: Can compute multiple group-bys together • However, partitioning attribute limits subtree • Hence choose the partitioning attribute that allows choice of largest subtree

  21. Algorithm: • Worklist w=MST • While w not empty • Choose any tree T from w • T’ = select_subtree(T) • Compute_subtree(T’)

  22. Select_subtree(T) • If mem reqd by T < M, return T • Else: for each attribute a of root(T), consider the subtree Ta of group-bys containing a • Let Pa = max # of partitions of root(T) possible if a is used for partitioning • Choose a s.t. (mem reqd by Ta)/Pa < M and Ta is the largest such subtree over all a • Add forest T−Ta to w, return Ta

  23. Compute_subtree(T’) • numParts = (mem reqd T’)* fudge_factor/M • Partition root(T’) into numParts • For each partition of root(T’) • For each node n in T’ (breadth first) • Compute all children of n in 1 scan • Release memory occupied by hash table of n

  24. Notes on PipeHash • PipeHash biased towards smallest-parent optimization • Eg: Compute BC from BCD (fig) • In practice, saving on sequential disk scans less important than reducing CPU cost of aggregation by choosing smallest parent!

  25. Overlap method • Sort based • Minimize disk accesses by overlapping computation of “cuboids” • Focus: Exploit partially matching sort orders to reduce sorting costs • Uses smallest parent optimization

  26. Sorted runs • B = (A1, A2, ..., Aj); S = (A1, ..., Al−1, Al+1, ..., Aj) • A sorted run of S in B is a maximal run of tuples in B whose ordering is consistent with the sort order in S • Eg: B = [(a,1,2), (a,1,3), (a,2,2), (b,1,3), (b,3,2), (c,3,1)]; S = cuboid on B's 1st & 3rd attributes = [(a,2), (a,3), (b,3), (b,2), (c,1)] • Sorted runs for S: [(a,2),(a,3)], [(a,2)], [(b,3)], [(b,2)] and [(c,1)]
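
The runs can be read off mechanically: because B is sorted, every maximal group of B-tuples agreeing on the prefix up to and including the dropped attribute projects to one already-sorted run of S. A sketch reproducing the example above:

```python
from itertools import groupby

def sorted_runs(b_tuples, drop_pos):
    """b_tuples: sorted tuples of B; drop_pos: index of the attribute
    removed in S. Yields the sorted runs of S in B."""
    runs = []
    # a run ends whenever the prefix through the dropped attribute changes
    for _, grp in groupby(b_tuples, key=lambda t: t[:drop_pos + 1]):
        runs.append([t[:drop_pos] + t[drop_pos + 1:] for t in grp])
    return runs

B = [("a", 1, 2), ("a", 1, 3), ("a", 2, 2),
     ("b", 1, 3), ("b", 3, 2), ("c", 3, 1)]
print(sorted_runs(B, drop_pos=1))
# [[('a', 2), ('a', 3)], [('a', 2)], [('b', 3)], [('b', 2)], [('c', 1)]]
```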

  27. Partitions • B, S have common prefix A1, ..., Al−1 • A partition of a cuboid S in B is the union of sorted runs s.t. the first (l−1) columns (i.e. the common prefix) have the same value • Previous eg: Partitions for S in B are: [(a,2),(a,3)], [(b,3),(b,2)] & [(c,1)] • Tuples from different partitions need not be merged for aggregation • Partition is an independent unit of computation

  28. Overview • Begin by sorting base cuboid • All other cuboids computed w/o re-sorting • Sort order of base cuboid determines sort order of all other cuboids • To maximize overlap across cuboid computations, reduce memory requirements of individual cuboids • Since partition is unit of computation, while computing one sorted cuboid from another, just need mem sufficient to hold a partition

  29. Overview (contd...) • When a partition is computed, its tuples can be pipelined to descendants; the same memory is reused by the next partition • Significant saving: PartSize << CuboidSize • Eg: Computing ABC and ABD from ABCD: PartSize(ABC) = 1 (ABC is a prefix of ABCD), PartSize(ABD) = |D| (only the prefix AB is shared)

  30. Choosing parent cuboids • Goal: Choose tree that minimizes size of partitions • Eg: Better to compute AC from ACD than ABC • Heuristic: Maximize size of common prefix

  31. Example cuboid tree

  32. Choosing overlapping cuboids • To compute a cuboid in memory, need memory = PartSize • If required memory is available, cuboid is in Partition state • Else allocate 1 memory page for the cuboid, and mark as SortRun state • Only tuples of a Partition cuboid can be pipelined to descendants

  33. Heuristics • Which cuboids to compute and in what state: optimal allocation is NP-hard! • Heuristic: Traverse tree in BFS order • Intuition: • Cuboids to the left have smaller partition sizes • So require less memory

  34. Cuboid computation • For each tuple t of B (parent) • If (state==partition) process_partition(t) • Else process_sorted_run(t) • Process_partition(t): • 3 cases: • Tuple starts new partition • Tuple matches existing tuple in partition • New tuple: Insert at appropriate place

  35. Cuboid computation (contd...) • Process_sorted_run(t): • 3 cases • Tuple starts new sorted run • Tuple matches last tuple in current run • New tuple: Append tuple to end of current run • Cuboid in Partition state fully computed in 1 pass • Cuboid in SortRun state: Combine merge step with computation of descendant cuboids
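
A compact sketch of the two per-tuple routines from these slides; the hypothetical flush() stands in for pipelining a finished partition to descendants (or writing a completed run out for the later merge), which is omitted here.

```python
import bisect

class Cuboid:
    def __init__(self, state):
        self.state = state          # "partition" or "sortrun"
        self.prefix = None          # common-prefix value of current partition
        self.buf = []               # sorted [(key, aggregate)] pairs

    def flush(self):
        # real algorithm: pipeline the finished partition to descendant
        # cuboids, or write the sorted run out for merging (omitted)
        self.buf = []

    def process(self, key, prefix, val):
        if self.state == "partition":
            if prefix != self.prefix:                       # case 1: new partition
                self.flush()
                self.prefix = prefix
            i = bisect.bisect_left(self.buf, (key,))
            if i < len(self.buf) and self.buf[i][0] == key:
                self.buf[i] = (key, self.buf[i][1] + val)   # case 2: match
            else:
                self.buf.insert(i, (key, val))              # case 3: insert in order
        else:                                               # SortRun state
            if self.buf and self.buf[-1][0] == key:
                self.buf[-1] = (key, self.buf[-1][1] + val) # matches last tuple
            elif self.buf and key < self.buf[-1][0]:        # order broken:
                self.flush()                                #   new sorted run
                self.buf.append((key, val))
            else:
                self.buf.append((key, val))                 # extend current run
```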

  36. Example CUBE computation (M=25 ; 9 sorted runs of BCD, CD to merge)

  37. An array-based algorithm for simultaneous multidimensional aggregates • Yihong Zhao, Prasad Deshpande, Jeffrey Naughton -- SIGMOD ‘97

  38. ROLAP vs MOLAP • CUBE central to OLAP operations • ROLAP: Relational OLAP systems • PipeSort, PipeHash, Overlap • MOLAP: Multidimensional OLAP systems • Store data in sparse arrays instead of relational tables

  39. MOLAP systems • Relational tuple: (jacket, K-mart, 1996, $40) • MOLAP representation: • Stores only ‘$40’ in a sparse array • Position in array encodes (jacket,K-mart,1996) • Arrays are “chunked” and compressed for efficient storage
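
A sketch of the position encoding, assuming a row-major linearization and illustrative dimension sizes: only the measure is stored, and the coordinates are recovered from the cell's offset.

```python
dims = [("jacket", "shirt"), ("kmart", "sears"), ("1996", "1997")]
sizes = [len(d) for d in dims]

def to_offset(coords):                 # row-major linearization
    off = 0
    for c, n in zip(coords, sizes):
        off = off * n + c
    return off

def from_offset(off):                  # invert: offset -> coordinates
    coords = []
    for n in reversed(sizes):
        coords.append(off % n)
        off //= n
    return list(reversed(coords))

cell = [dims[i].index(v) for i, v in enumerate(("jacket", "kmart", "1996"))]
print(to_offset(cell), from_offset(to_offset(cell)))   # 0 [0, 0, 0]
```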

  40. Problem • No concept of “reordering” to bring together related tuples • Order cell visits to simultaneously compute several aggregates whilst minimizing memory requirements and # of cell visits

  41. Efficient array storage: Issues • Array too large to fit in memory: Split into “chunks” that fit in memory • Even with chunking, array may be sparse: Compression needed • Standard programming-language technique of storing arrays in row/column major form is inefficient • Creates asymmetry amongst dimensions, favoring some over others

  42. Chunking • Divide n-D array into small n-D chunks and store each chunk as an independent object on disk • Keep size of chunk = disk block size • We shall use chunks having the same size along each dimension
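
A sketch of the resulting addressing scheme, assuming (as the slide states) the same chunk length along every dimension: a cell maps to a chunk number plus an offset inside that chunk, both in row-major order.

```python
def chunk_address(coords, dim_sizes, chunk_len):
    """Map an n-D cell to (chunk number, offset inside the chunk)."""
    chunk_idx = [c // chunk_len for c in coords]    # which chunk per dim
    in_chunk  = [c %  chunk_len for c in coords]    # position inside it
    n_chunks  = [s // chunk_len for s in dim_sizes]
    chunk_no = offset = 0
    for i, n in zip(chunk_idx, n_chunks):
        chunk_no = chunk_no * n + i                 # row-major over chunks
    for p in in_chunk:
        offset = offset * chunk_len + p             # row-major inside chunk
    return chunk_no, offset

# 16x16x16 array with 4x4x4 chunks, as in the later example:
print(chunk_address((5, 0, 13), (16, 16, 16), 4))   # (19, 17)
```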

  43. Compressing sparse arrays • Store “dense” chunks as is (> 40% occupancy) • Already a significant compression over a relational table • Sparse arrays: Use chunk-offset compression – (offsetInChunk, data) • Better than LZW etc. because: • Uses domain knowledge • LZW data requires decompression before use
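
A minimal sketch of chunk-offset compression with the 40% density threshold from the slide; None marks an empty cell, and a dense chunk is kept as a plain array.

```python
def compress_chunk(chunk):
    """chunk: flat list of cell values, None = empty cell."""
    filled = [(i, v) for i, v in enumerate(chunk) if v is not None]
    if len(filled) > 0.4 * len(chunk):
        return ("dense", chunk)          # store as is: already compact
    return ("sparse", filled)            # (offsetInChunk, data) pairs

print(compress_chunk([None, None, 7.0, None]))   # ('sparse', [(2, 7.0)])
```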

  44. Loading arrays from tables • Input: Table, dim sizes, chunk size • If M < array size, partition sets of chunks into groups which fit in memory; eg: with 16 chunks and 2 partitions, group chunks 0–7 & 8–15 • Scan table. For each tuple, calculate & store (chunk#, offset, data) into the buffer page for the corresponding partition • 2nd pass: For each partition, read tuples and assign to chunks in memory. Compress.

  45. Basic algo (No overlapping) • Eg: 3-D array ABC; to compute AB: if the array fits in memory, sweep a plane of size |A|*|B| along dim C, aggregating as you go • If array chunked: Sweep a plane of size |Ac|*|Bc| through the upper-left chunks. Store result, move to chunks on right • Each chunk read in only once • Mem: 1 chunk + |Ac|*|Bc| plane
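
A sketch of the in-memory case, using NumPy purely for brevity: the |A| x |B| plane is pushed along dimension C, so every input cell is touched exactly once.

```python
import numpy as np

abc = np.arange(2 * 3 * 4).reshape(2, 3, 4)      # toy dense ABC array
plane = np.zeros((2, 3))                         # the |A| x |B| sweep plane
for c in range(abc.shape[2]):                    # advance along dimension C
    plane += abc[:, :, c]                        # aggregate the current slice

assert (plane == abc.sum(axis=2)).all()          # plane now holds AB
```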

  46. Generalization • Sweep k−1 dimensional subarrays through a k-dimensional array • Multiple group-bys: Use smallest parent optimization in cube lattice • Advantage over ROLAP: Since dimension & chunk sizes are known, exact node sizes can be computed • Min Size Spanning Tree (MSST): The parent of each node n is its minimum-size parent n’ in the lattice

  47. Basic array cubing algorithm: • First construct the MSST of the group-bys • Compute a group-by Di1Di2...Dik from its min-size parent Di1...Dik+1: • Read in each chunk of Di1...Dik+1 along dimension Dik+1 and aggregate it into a chunk of Di1...Dik. Once a chunk of Di1...Dik is complete, flush it and reuse the memory

  48. Example • ABC – 16x16x16 array • Chunk size: 4x4x4 • Dimension order: ABC • Eg: Computing BC: Read chunks in order 1..64; after every 4 chunks, flush a BC chunk to disk and reuse the memory

  49. 3-D array (Dim order ABC)

  50. Multi-Way algorithm • Single pass version: Assume enough memory to compute all group-bys in 1 scan • Reduce memory requirements using a special order to scan input array, called dimension order • A dimension order of the array chunks is a row major order of the chunks with n dimensions D1...Dn in some order O = (Di1,Di2,...Din)
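
A sketch of what a dimension order means operationally, assuming a small chunk grid: a row-major enumeration of the chunk coordinates with the dimensions permuted, so the first dimension in O varies slowest.

```python
from itertools import product

def chunk_scan_order(n_chunks_per_dim, order):
    """Yield chunk coordinates; order[0] varies slowest, order[-1] fastest."""
    ranges = [range(n_chunks_per_dim[d]) for d in order]
    for combo in product(*ranges):
        coords = [0] * len(order)
        for d, c in zip(order, combo):
            coords[d] = c                 # place each count back in its dim
        yield tuple(coords)

# 2x2x2 chunk grid scanned in dimension order O = (C, A, B) = (2, 0, 1):
print(list(chunk_scan_order((2, 2, 2), (2, 0, 1)))[:4])
# [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
```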
