
Computing the cube


Presentation Transcript


  1. Computing the cube Abhinandan Das CS 632 Mar 8 2001

  2. On the Computation of Multidimensional Aggregates • Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey Naughton, Raghu Ramakrishnan & Sunita Sarawagi -- VLDB 1996

  3. Motivation • OLAP / Multidimensional data analysis • Eg: Transactions(Prod,Date,StoreId,Cust,Sales) • Sum of sales by: (P,SId) ; (P) ; (P,D,SId) • Computing multidimensional aggregates is a performance bottleneck • Efficient computation of several related group-bys

  4. What is a CUBE? • n-dimensional generalization of the group-by operator • Group-bys corresponding to all possible subsets of a given set of attributes • Eg: SELECT P, D, C, Sum(Sales) FROM Transactions CUBE-BY P, D, C • Produces all 2³ = 8 group-bys: ALL, P, D, C, PD, PC, DC, PDC
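
For concreteness, a minimal Python sketch (not from the paper) of the naïve approach: one hash aggregation per subset of the CUBE-BY attributes, so the base table is scanned once per group-by. Attribute names and rows are illustrative.

```python
from itertools import combinations
from collections import defaultdict

def naive_cube(rows, dims, measure):
    """rows: list of dicts; dims: CUBE-BY attributes; measure: name of the
    value to sum. Returns {attribute-subset: {group key: aggregate}}."""
    cube = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):   # () is the ALL group-by
            agg = defaultdict(float)
            for r in rows:                     # one full scan per group-by
                agg[tuple(r[a] for a in subset)] += r[measure]
            cube[subset] = dict(agg)
    return cube

rows = [{"P": "jacket", "D": "1996", "C": "kmart", "Sales": 40},
        {"P": "jacket", "D": "1997", "C": "sears", "Sales": 30}]
print(naive_cube(rows, ["P", "D", "C"], "Sales")[("P",)])
# {('jacket',): 70.0}
```

The optimizations on the following slides exist precisely to avoid this one-scan-per-group-by cost.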

  5. Approaches • Basic group-by methods: • Sort based • Hash based • Naïve approach

  6. Possible optimizations • Smallest parent • Cache results • Amortize scans • Share sorts • Share partitions • Often contradictory • Assumption: Distributive aggregate function • Distributive: sum, count, min, max (average is handled via sum & count); non-distributive: median
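
A quick illustration of why distributivity matters: a distributive function can be assembled from sub-aggregates, which is what lets a group-by be computed from a parent group-by instead of from the raw data. Median cannot be assembled this way. A minimal Python check:

```python
import statistics

part1, part2 = [1, 2, 9], [3, 4]

# sum is distributive: partial sums combine into the total
assert sum(part1) + sum(part2) == sum(part1 + part2)

# median is not: the median of medians is generally wrong
med_of_meds = statistics.median([statistics.median(part1),
                                 statistics.median(part2)])
true_med = statistics.median(part1 + part2)
print(med_of_meds, true_med)   # 2.75 vs 3 -- they disagree
```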

  7. Sort based methods • Algorithm PipeSort • Share-sorts vs smallest parent • Optimize to get minimum total cost • Cache-results & amortize-scans • Pipelining: ABCD → ABC → AB → A
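
A sketch of what the pipeline buys, assuming tuples already sorted on (A, B, C, D) and a sum aggregate: one scan feeds every prefix group-by, and each group-by holds only one running aggregate, flushed when its prefix changes.

```python
def pipeline(sorted_rows):
    """sorted_rows: (a, b, c, d, value) tuples sorted on (a, b, c, d).
    Computes ABC, AB and A in one scan with one open group per prefix."""
    prefixes = (3, 2, 1)                       # ABC, AB, A
    current = {k: None for k in prefixes}      # (key, running_sum) per prefix
    results = {k: [] for k in prefixes}
    for *key, v in sorted_rows:
        for k in prefixes:
            pk = tuple(key[:k])
            if current[k] and current[k][0] != pk:
                results[k].append(current[k])  # prefix changed: emit group
                current[k] = None
            current[k] = (pk, (current[k][1] if current[k] else 0) + v)
    for k in prefixes:                         # flush the last open groups
        if current[k]:
            results[k].append(current[k])
    return results

rows = [("a", 1, "x", "p", 5), ("a", 1, "x", "q", 2), ("a", 2, "y", "p", 1)]
print(pipeline(rows)[1])                       # [(('a',), 8)]
```

(In the actual pipeline each group-by's output feeds the next one down, ABCD → ABC → AB → A, rather than every prefix reading the raw tuple; for a distributive function the two are equivalent.)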

  8. PipeSort • Assumption: Have an estimate of the number of distinct values for each group-by • Input: Search lattice • Graph where each vertex represents a group-by of the cube • Edge i → j if |i| = |j| + 1 and j's attributes are a subset of i's

  9. Search lattice: CUBE-BY ABCD

  10. Search lattice (contd...) • Each edge eij associated with two costs: • S(eij): Cost of computing j from i when i is pre-sorted • U(eij): Cost of computing j from i when i is not already sorted • Idea: If attribute order of a group-by j is a prefix of parent i, compute j without sorting (Cost S(eij)) else first sort (Cost U(eij))
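
The prefix rule as a tiny Python helper; the S and U values are assumed, precomputed cost estimates as on the slide:

```python
def edge_cost(parent_order, child_order, S, U):
    """S: cost of computing the child when the parent's sort order can be
    reused; U: cost when the parent must first be re-sorted."""
    is_prefix = parent_order[:len(child_order)] == child_order
    return S if is_prefix else U

print(edge_cost(("A", "B", "C"), ("A", "B"), S=10, U=25))  # 10: AB is a prefix
print(edge_cost(("A", "B", "C"), ("A", "C"), S=10, U=25))  # 25: AC needs a sort
```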

  11. PipeSort (contd...) • Proceed level by level, k=0 to k=N-1 • Find best way of computing level k from level k+1 • Weighted bipartite matching: • Make k additional replicas of each node at level k+1 • Cost S(eij) on original node, U(eij) on replicas • Find minimum cost maximal matching

  12. Min cost matching

  13. Algorithm PipeSort • For level k = 0 to N−1 • Generate_plan(k+1 → k) • Fix sort order of level k+1 nodes • Generate_plan(k+1 → k): • Make k additional replicas of level k+1 nodes, assign appropriate edge costs • Find min-cost matching
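
A sketch of Generate_plan using an off-the-shelf assignment solver in place of the paper's weighted bipartite matching routine. S_cost and U_cost are assumed precomputed dictionaries keyed by (parent, child), and BIG forbids pairing a child with a non-parent.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def generate_plan(parents, children, S_cost, U_cost, k):
    """One level of PipeSort: assign each level-k group-by (child) to one
    copy of a level-(k+1) group-by (parent)."""
    BIG = 10**9                                   # forbids invalid pairings
    cols = [(p, r) for p in parents for r in range(k + 1)]  # k extra replicas
    cost = np.full((len(children), len(cols)), BIG, dtype=float)
    for i, c in enumerate(children):
        for j, (p, r) in enumerate(cols):
            if set(c) <= set(p):                  # child must be subset of parent
                cost[i, j] = S_cost[p, c] if r == 0 else U_cost[p, c]
    rows, picked = linear_sum_assignment(cost)
    # r == 0: child computed from the parent's existing sort order (this
    # also fixes that parent's order); r >= 1: parent is re-sorted first
    return [(children[i], cols[j]) for i, j in zip(rows, picked)]
```

Because the solver assigns every child exactly once and each parent copy at most once, the result is a maximal matching of minimum total cost, as the slide requires.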

  14. Example plan

  15. Tweaks • Aggregate and remove duplicates whilst sorting • Use partial sorting order to reduce sort costs • Eg: ABC → AC

  16. Hash based methods • Algorithm PipeHash • Can include all stated optimizations (if enough memory is available) • For k = N−1 downto 0 • For each group-by g at level k+1 • Compute in 1 scan of g all children for which g is smallest parent • Save g and deallocate hash table of g
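
A minimal sketch of the inner step, assuming the parent group-by g is held as a hash table from key tuples to partial sums; distributivity is what lets g's aggregates be reused in place of the raw data.

```python
from collections import defaultdict

def compute_children(g_table, g_attrs, child_attr_sets):
    """One scan of parent g produces every child for which g was chosen
    as the smallest parent. g_table: {key_tuple: aggregate}."""
    children = {c: defaultdict(float) for c in child_attr_sets}
    pos = {a: i for i, a in enumerate(g_attrs)}
    for key, val in g_table.items():              # single scan of g
        for c in child_attr_sets:
            ck = tuple(key[pos[a]] for a in c)
            children[c][ck] += val                # reuse g's partial sums
    return children

g = {("jacket", "1996", "kmart"): 40.0, ("jacket", "1997", "kmart"): 30.0}
print(compute_children(g, ("P", "D", "C"), [("P",), ("P", "C")])[("P", "C")])
# {('jacket', 'kmart'): 70.0}
```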

  17. PipeHash • Limited memory → use partitioning • Optimization share-partitions: • When data is partitioned on attribute A, all group-bys containing A can be computed independently on each partition • No recombination required

  18. PipeHash: Overview • First choose smallest parent for each group-by (gives a minimum spanning tree, MST) • Optimize for memory constraints: • Decide what group-bys to compute together • Choice of attribute for data partitioning • Minimizing overall disk scan cost: NP-hard!

  19. PipeHash

  20. Heuristics • Optimizations cache-results and amortize-scans favoured by choosing large subtrees of MST: Can compute multiple group-bys together • However, partitioning attribute limits subtree • Hence choose the partitioning attribute that allows choice of largest subtree

  21. Algorithm: • Worklist w=MST • While w not empty • Choose any tree T from w • T’ = select_subtree(T) • Compute_subtree(T’)

  22. Select_subtree(T) • If mem reqd by T < M, return T • Else: for each attribute a of root(T), consider the subtree Ta of group-bys containing a • Let Pa = max # of partitions of root(T) possible if a is used for partitioning • Choose a s.t. (mem reqd by Ta)/Pa < M and Ta is the largest such subtree over all a • Add forest T−Ta to w, return Ta

  23. Compute_subtree(T’) • numParts = (mem reqd T’)* fudge_factor/M • Partition root(T’) into numParts • For each partition of root(T’) • For each node n in T’ (breadth first) • Compute all children of n in 1 scan • Release memory occupied by hash table of n

  24. Notes on PipeHash • PipeHash biased towards smallest-parent optimization • Eg: Compute BC from BCD (fig) • In practice, saving on sequential disk scans less important than reducing CPU cost of aggregation by choosing smallest parent!

  25. Overlap method • Sort based • Minimize disk accesses by overlapping computation of “cuboids” • Focus: Exploit partially matching sort orders to reduce sorting costs • Uses smallest parent optimization

  26. Sorted runs • B = (A1, A2, ..., Aj); S = (A1, ..., Al−1, Al+1, ..., Aj) • A sorted run of S in B is a maximal run of tuples in B whose ordering is consistent with the sort order in S • Eg: B = [(a,1,2), (a,1,3), (a,2,2), (b,1,3), (b,3,2), (c,3,1)]; S = cuboid on B's 1st & 3rd attributes = [(a,2), (a,3), (b,3), (b,2), (c,1)] • Sorted runs for S: [(a,2),(a,3)], [(a,2)], [(b,3)], [(b,2)] and [(c,1)]
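
The runs can be read off mechanically: because B is sorted, every maximal group of B-tuples agreeing on the prefix up to and including the dropped attribute projects to one already-sorted run of S. A sketch reproducing the example above:

```python
from itertools import groupby

def sorted_runs(b_tuples, drop_pos):
    """b_tuples: sorted tuples of B; drop_pos: index of the attribute
    removed in S. Yields the sorted runs of S in B."""
    runs = []
    # a run ends whenever the prefix through the dropped attribute changes
    for _, grp in groupby(b_tuples, key=lambda t: t[:drop_pos + 1]):
        runs.append([t[:drop_pos] + t[drop_pos + 1:] for t in grp])
    return runs

B = [("a", 1, 2), ("a", 1, 3), ("a", 2, 2),
     ("b", 1, 3), ("b", 3, 2), ("c", 3, 1)]
print(sorted_runs(B, drop_pos=1))
# [[('a', 2), ('a', 3)], [('a', 2)], [('b', 3)], [('b', 2)], [('c', 1)]]
```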

  27. Partitions • B, S have common prefix A1, ..., Al−1 • A partition of a cuboid S in B is the union of sorted runs s.t. the first (l−1) columns (i.e. the common prefix) have the same value • Previous eg: Partitions for S in B are: [(a,2),(a,3)], [(b,3),(b,2)] & [(c,1)] • Tuples from different partitions need not be merged for aggregation • Partition is an independent unit of computation

  28. Overview • Begin by sorting base cuboid • All other cuboids computed w/o re-sorting • Sort order of base cuboid determines sort order of all other cuboids • To maximize overlap across cuboid computations, reduce memory requirements of individual cuboids • Since partition is unit of computation, while computing one sorted cuboid from another, just need mem sufficient to hold a partition

  29. Overview (contd...) • When a partition is computed, its tuples can be pipelined to descendants; the same memory is reused by the next partition • Significant saving: PartSize << CuboidSize • Eg: Computing ABC and ABD from ABCD: PartSize(ABC) = 1 (ABC is a prefix of ABCD), PartSize(ABD) = |D| (only the prefix AB is shared)

  30. Choosing parent cuboids • Goal: Choose tree that minimizes size of partitions • Eg: Better to compute AC from ACD than ABC • Heuristic: Maximize size of common prefix

  31. Example cuboid tree

  32. Choosing overlapping cuboids • To compute a cuboid in memory, need memory = PartSize • If required memory is available, cuboid is in Partition state • Else allocate 1 memory page for the cuboid, and mark as SortRun state • Only tuples of a Partition cuboid can be pipelined to descendants

  33. Heuristics • Which cuboids to compute and in what state: optimal allocation is NP-hard! • Heuristic: Traverse tree in BFS order • Intuition: • Cuboids to the left have smaller partition sizes • So require less memory

  34. Cuboid computation • For each tuple t of B (parent) • If (state==partition) process_partition(t) • Else process_sorted_run(t) • Process_partition(t): • 3 cases: • Tuple starts new partition • Tuple matches existing tuple in partition • New tuple: Insert at appropriate place

  35. Cuboid computation (contd...) • Process_sorted_run(t): • 3 cases • Tuple starts new sorted run • Tuple matches last tuple in current run • New tuple: Append tuple to end of current run • Cuboid in Partition state fully computed in 1 pass • Cuboid in SortRun state: Combine merge step with computation of descendant cuboids
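
A compact sketch of the two per-tuple routines from these slides; the hypothetical flush() stands in for pipelining a finished partition to descendants (or writing a completed run out for the later merge), which is omitted here.

```python
import bisect

class Cuboid:
    def __init__(self, state):
        self.state = state          # "partition" or "sortrun"
        self.prefix = None          # common-prefix value of current partition
        self.buf = []               # sorted [(key, aggregate)] pairs

    def flush(self):
        # real algorithm: pipeline the finished partition to descendant
        # cuboids, or write the sorted run out for merging (omitted)
        self.buf = []

    def process(self, key, prefix, val):
        if self.state == "partition":
            if prefix != self.prefix:                       # case 1: new partition
                self.flush()
                self.prefix = prefix
            i = bisect.bisect_left(self.buf, (key,))
            if i < len(self.buf) and self.buf[i][0] == key:
                self.buf[i] = (key, self.buf[i][1] + val)   # case 2: match
            else:
                self.buf.insert(i, (key, val))              # case 3: insert in order
        else:                                               # SortRun state
            if self.buf and self.buf[-1][0] == key:
                self.buf[-1] = (key, self.buf[-1][1] + val) # matches last tuple
            elif self.buf and key < self.buf[-1][0]:        # order broken:
                self.flush()                                #   new sorted run
                self.buf.append((key, val))
            else:
                self.buf.append((key, val))                 # extend current run
```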

  36. Example CUBE computation (M=25 ; 9 sorted runs of BCD, CD to merge)

  37. An array-based algorithm for simultaneous multidimensional aggregates • Yihong Zhao, Prasad Deshpande, Jeffrey Naughton -- SIGMOD ‘97

  38. ROLAP vs MOLAP • CUBE central to OLAP operations • ROLAP: Relational OLAP systems • PipeSort, PipeHash, Overlap • MOLAP: Multidimensional OLAP systems • Store data in sparse arrays instead of relational tables

  39. MOLAP systems • Relational tuple: (jacket, K-mart, 1996, $40) • MOLAP representation: • Stores only ‘$40’ in a sparse array • Position in array encodes (jacket,K-mart,1996) • Arrays are “chunked” and compressed for efficient storage
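
A sketch of the position encoding, assuming a row-major linearization and illustrative dimension sizes: only the measure is stored, and the coordinates are recovered from the cell's offset.

```python
dims = [("jacket", "shirt"), ("kmart", "sears"), ("1996", "1997")]
sizes = [len(d) for d in dims]

def to_offset(coords):                 # row-major linearization
    off = 0
    for c, n in zip(coords, sizes):
        off = off * n + c
    return off

def from_offset(off):                  # invert: offset -> coordinates
    coords = []
    for n in reversed(sizes):
        coords.append(off % n)
        off //= n
    return list(reversed(coords))

cell = [dims[i].index(v) for i, v in enumerate(("jacket", "kmart", "1996"))]
print(to_offset(cell), from_offset(to_offset(cell)))   # 0 [0, 0, 0]
```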

  40. Problem • No concept of “reordering” to bring together related tuples • Order cell visits to simultaneously compute several aggregates whilst minimizing memory requirements and # of cell visits

  41. Efficient array storage: Issues • Array too large to fit in memory: Split into “chunks” that fit in memory • Even with chunking, array may be sparse: Compression needed • Standard programming-language technique of storing arrays in row/column major form is inefficient • Creates asymmetry amongst dimensions, favoring some over others

  42. Chunking • Divide n-D array into small n-D chunks and store each chunk as an independent object on disk • Keep size of chunk = disk block size • We shall use chunks having the same size along each dimension
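
A sketch of the resulting addressing scheme, assuming (as the slide states) the same chunk length along every dimension: a cell maps to a chunk number plus an offset inside that chunk, both in row-major order.

```python
def chunk_address(coords, dim_sizes, chunk_len):
    """Map an n-D cell to (chunk number, offset inside the chunk)."""
    chunk_idx = [c // chunk_len for c in coords]    # which chunk per dim
    in_chunk  = [c %  chunk_len for c in coords]    # position inside it
    n_chunks  = [s // chunk_len for s in dim_sizes]
    chunk_no = offset = 0
    for i, n in zip(chunk_idx, n_chunks):
        chunk_no = chunk_no * n + i                 # row-major over chunks
    for p in in_chunk:
        offset = offset * chunk_len + p             # row-major inside chunk
    return chunk_no, offset

# 16x16x16 array with 4x4x4 chunks, as in the later example:
print(chunk_address((5, 0, 13), (16, 16, 16), 4))   # (19, 17)
```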

  43. Compressing sparse arrays • Store “dense” chunks as is (> 40% occupancy) • Already a significant compression over a relational table • Sparse arrays: Use chunk-offset compression – (offsetInChunk, data) • Better than LZW etc. because: • Uses domain knowledge • LZW data requires decompression before use
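
A minimal sketch of chunk-offset compression with the 40% density threshold from the slide; None marks an empty cell, and a dense chunk is kept as a plain array.

```python
def compress_chunk(chunk):
    """chunk: flat list of cell values, None = empty cell."""
    filled = [(i, v) for i, v in enumerate(chunk) if v is not None]
    if len(filled) > 0.4 * len(chunk):
        return ("dense", chunk)          # store as is: already compact
    return ("sparse", filled)            # (offsetInChunk, data) pairs

print(compress_chunk([None, None, 7.0, None]))   # ('sparse', [(2, 7.0)])
```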

  44. Loading arrays from tables • Input: Table, dim sizes, chunk size • If M < array size, partition sets of chunks into groups which fit in memory; eg: with 16 chunks and 2 partitions, group chunks 0–7 & 8–15 • Scan table. For each tuple, calculate & store (chunk#, offset, data) into the buffer page for the corresponding partition • 2nd pass: For each partition, read tuples and assign to chunks in memory. Compress.

  45. Basic algo (No overlapping) • Eg: 3-D array ABC; to compute AB: if the array fits in memory, sweep a plane of size |A|*|B| along dim C, aggregating as you go • If array chunked: Sweep a plane of size |Ac|*|Bc| through the upper-left chunks. Store result, move to chunks on right • Each chunk read in only once • Mem: 1 chunk + |Ac|*|Bc| plane
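
A sketch of the in-memory case, using NumPy purely for brevity: the |A| x |B| plane is pushed along dimension C, so every input cell is touched exactly once.

```python
import numpy as np

abc = np.arange(2 * 3 * 4).reshape(2, 3, 4)      # toy dense ABC array
plane = np.zeros((2, 3))                         # the |A| x |B| sweep plane
for c in range(abc.shape[2]):                    # advance along dimension C
    plane += abc[:, :, c]                        # aggregate the current slice

assert (plane == abc.sum(axis=2)).all()          # plane now holds AB
```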

  46. Generalization • Sweep k−1 dimensional subarrays through a k-dimensional array • Multiple group-bys: Use smallest parent optimization in cube lattice • Advantage over ROLAP: Since dimension & chunk sizes are known, exact node sizes can be computed • Min Size Spanning Tree (MSST): The parent of each node n is its minimum-size parent n’ in the lattice

  47. Basic array cubing algorithm: • First construct the MSST of the group-bys • Compute a group-by Di1Di2...Dik from its min-size parent Di1...Dik+1: • Read in each chunk of Di1...Dik+1 along dimension Dik+1 and aggregate it into a chunk of Di1...Dik. Once a chunk of Di1...Dik is complete, flush it and reuse the memory

  48. Example • ABC – 16x16x16 array • Chunk size: 4x4x4 • Dimension order: ABC • Eg: Computing BC: Read chunks in order 1..64; after every 4 chunks, flush a BC chunk to disk and reuse the memory

  49. 3-D array (Dim order ABC)

  50. Multi-Way algorithm • Single pass version: Assume enough memory to compute all group-bys in 1 scan • Reduce memory requirements using a special order to scan input array, called dimension order • A dimension order of the array chunks is a row major order of the chunks with n dimensions D1...Dn in some order O = (Di1,Di2,...Din)
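
A sketch of what a dimension order means operationally, assuming a small chunk grid: a row-major enumeration of the chunk coordinates with the dimensions permuted, so the first dimension in O varies slowest.

```python
from itertools import product

def chunk_scan_order(n_chunks_per_dim, order):
    """Yield chunk coordinates; order[0] varies slowest, order[-1] fastest."""
    ranges = [range(n_chunks_per_dim[d]) for d in order]
    for combo in product(*ranges):
        coords = [0] * len(order)
        for d, c in zip(order, combo):
            coords[d] = c                 # place each count back in its dim
        yield tuple(coords)

# 2x2x2 chunk grid scanned in dimension order O = (C, A, B) = (2, 0, 1):
print(list(chunk_scan_order((2, 2, 2), (2, 0, 1)))[:4])
# [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
```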
