Main Memory Index Structures
Presented by Ajit Padukone
References:
• T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
• J. Rao, K. Ross. Making B+ Trees Cache Conscious in Main Memory. ACM SIGMOD (2000)
• P. Bohannon, P. McIlroy, R. Rastogi. Main Memory Index Structures with Fixed Size Partial Keys. ACM SIGMOD (2001)
Outline
• Motivation
• B+Trees / B-Trees
• Lehman/Carey
  – AVL Trees
  – Hashing: Chained, Extendible, Linear
  – T-Trees
• Rao/Ross
  – Cache Sensitive B+ Trees
• Bohannon et al.
  – Partial Key B-Trees
Motivation
• Main-memory DBMSs were predicted in the early 1980s.
• They became feasible as memory chip density and speed grew and prices fell.
• The entire database is stored in main memory.
• B-Trees / B+Trees were designed for disk storage.
• Index structures need to be optimized for main memory.
Optimizations
• Use CPU and memory efficiently.
• There are no disk accesses to minimize.
• Primary goal: minimize computation time.
• Nodes can store pointers to tuples instead of the data values themselves.
B-Trees / B+Trees (Disk)
• Reduce the number of random disk accesses.
• Used by most DBMSs.
• Node size is matched to the block/segment size of the physical storage device (typical sizes: 512 / 2048 / 4096 / 8192 bytes).
• Always balanced.
B-Trees / B+Trees Structure (figure)
Source: http://paprika.umw.edu/~ernie/cpsc321/10312006.html
AVL Trees
• Binary tree search is very fast.
• Updates may cause rotations to keep the tree balanced.
• Poor space utilization.
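The rotation mentioned above can be illustrated with a minimal sketch of a single right rotation. The names `AVLNode`, `height`, and `rotate_right` are chosen here for illustration and do not come from the slides. A real AVL tree would store a balance factor or height in each node rather than recomputing it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AVLNode:
    key: int
    left: Optional["AVLNode"] = None
    right: Optional["AVLNode"] = None


def height(node: Optional[AVLNode]) -> int:
    # Recomputed recursively for clarity; real AVL nodes cache this value.
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def rotate_right(y: AVLNode) -> AVLNode:
    """Rotate the subtree rooted at y to the right and return the new root."""
    x = y.left
    # x's old right subtree becomes y's left subtree; y becomes x's right child.
    y.left, x.right = x.right, y
    return x


# A left-leaning chain 3 -> 2 -> 1 is unbalanced; one right rotation fixes it.
root = AVLNode(3, left=AVLNode(2, left=AVLNode(1)))
balanced = rotate_right(root)
```

Each node here holds one key and two pointers, which may help explain the space-utilization concern noted on the slide.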
AVL Trees (figure)
Source: T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
B-Trees
• In main memory, B-Trees are preferred over B+Trees because there is no advantage in keeping all the data in the leaves.
• Good storage utilization.
• Quick searching.
• Fast updating.
B-Trees (figure)
Source: T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
Hashing – Linear
• Dynamically sized hash table.
• Buckets are split in linear order based on a chosen criterion (not just overflow), unlike Extendible Hashing.
• Controlled splitting provides advantages.
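The splitting behavior described above can be sketched as a toy linear hash table. This is an illustrative sketch, not the variant measured by Lehman and Carey. The class name `LinearHashTable`, the `max_load` criterion (average items per bucket), and the use of Python's built-in `hash` are all assumptions made for the example.

```python
class LinearHashTable:
    """Toy linear hash table: buckets split one at a time, in linear order."""

    def __init__(self, initial_buckets=2, max_load=2.0):
        self.n0 = initial_buckets   # bucket count at the start of round 0
        self.level = 0              # current round number
        self.split = 0              # index of the next bucket to split
        self.max_load = max_load    # assumed criterion: average items per bucket
        self.buckets = [[] for _ in range(initial_buckets)]
        self.count = 0

    def _index(self, key):
        h = hash(key)
        i = h % (self.n0 << self.level)
        if i < self.split:          # bucket already split in this round
            i = h % (self.n0 << (self.level + 1))
        return i

    def insert(self, key):
        self.buckets[self._index(key)].append(key)
        self.count += 1
        # The split is triggered by the load criterion, not by which bucket grew.
        if self.count / len(self.buckets) > self.max_load:
            self._split_next()

    def _split_next(self):
        old = self.buckets[self.split]
        self.buckets[self.split] = []
        self.buckets.append([])
        self.split += 1
        if self.split == self.n0 << self.level:   # round complete
            self.level += 1
            self.split = 0
        for key in old:             # only the split bucket is rehashed
            self.buckets[self._index(key)].append(key)

    def contains(self, key):
        return key in self.buckets[self._index(key)]


table = LinearHashTable()
for k in range(100):
    table.insert(k)
```

Because only one bucket is rehashed per split, the table grows gradually and never needs a directory-doubling step.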
Hashing – Linear (figure)
Source: T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
T-Tree
• Derived from AVL Trees and B-Trees.
• Retains the binary-search nature of the AVL Tree and the good update and storage characteristics of the B-Tree.
Source: T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
T-Tree Terminology
• Internal nodes
• Half-leaf nodes
• Leaf nodes
• Greatest lower bound
• Least upper bound
Source: T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
T-Tree Algorithms
• Search
• Insert
• Delete
• Rebalance (similar to AVL Trees)
Source: T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
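The search step can be sketched as follows, assuming each T-Tree node holds a sorted array of keys plus left and right child pointers. The names `TNode` and `ttree_search` are illustrative. Insert, delete, and rebalancing are omitted, and the tree in the usage lines is built by hand.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TNode:
    keys: List[int]                  # sorted elements held in this node
    left: Optional["TNode"] = None
    right: Optional["TNode"] = None


def ttree_search(node: Optional[TNode], key: int) -> bool:
    """Return True if key is stored in the T-Tree rooted at node."""
    while node is not None:
        if key < node.keys[0]:       # below the node's minimum: go left
            node = node.left
        elif key > node.keys[-1]:    # above the node's maximum: go right
            node = node.right
        else:                        # this node bounds the key: search inside it
            i = bisect_left(node.keys, key)
            return node.keys[i] == key
    return False


# Hand-built example: one internal node with two leaf nodes.
root = TNode([40, 45, 50], left=TNode([10, 20, 30]), right=TNode([60, 70, 80]))
found = ttree_search(root, 45)
missing = ttree_search(root, 47)
```

Only the minimum and maximum of each node are compared on the way down, which keeps the binary-search character of an AVL Tree while storing many keys per node as a B-Tree does.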
Performance Comparison (figures)
• Insert
• Search
• Mixed Queries
Source: T. Lehman, M. Carey. A Study of Index Structures for Main Memory Database Systems. VLDB (1986)
Cache Sensitive B+ Trees
• Proposed by Rao and Ross in 2000.
• CPU performance has far outpaced main-memory performance; T-Trees do not perform much better in this setting.
• Indexes need to be cache-sensitive: fit each node to a cache line to minimize cache misses.
• CSB+ Trees are B+Trees with fewer pointers and more key data in each node.
• CSB+ Trees perform poorly when there are many node splits, so node groups are “segmented”.
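One way to picture "fewer pointers" is a layout in which the children of a node sit next to each other, so the node keeps a single reference to its first child and reaches the others by offset. The sketch below assumes that layout and models memory as a Python list. The names `CSBNode` and `csb_search` are illustrative, and splits and node-group segmentation are not shown.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class CSBNode:
    keys: List[int]         # sorted separator keys (or data keys in a leaf)
    first_child: int = -1   # position of the first child in `nodes`; -1 marks a leaf


def csb_search(nodes: List[CSBNode], root: int, key: int) -> bool:
    """Search a tree whose sibling nodes are stored contiguously in `nodes`."""
    i = root
    while nodes[i].first_child != -1:
        node = nodes[i]
        # The k-th child is reached by offset from the single stored reference.
        i = node.first_child + bisect_right(node.keys, key)
    return key in nodes[i].keys


# Root at position 0; its three children form one contiguous group at 1..3.
nodes = [
    CSBNode([10, 20], first_child=1),
    CSBNode([1, 5]),
    CSBNode([10, 15]),
    CSBNode([20, 25]),
]
hit = csb_search(nodes, 0, 15)
miss = csb_search(nodes, 0, 12)
```

The space saved by dropping per-child pointers can hold extra keys, so more of each cache line carries useful comparison data.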
Cache Sensitive B+ Trees (figure)
Source: J. Rao, K. Ross. Making B+ Trees Cache Conscious in Main Memory. ACM SIGMOD (2000)
Performance Comparison (figure)
• Segmented CSB Trees (CSS)
Source: J. Rao, K. Ross. Making B+ Trees Cache Conscious in Main Memory. ACM SIGMOD (2000)
Partial Key B-Trees
• Proposed by Bohannon et al. in 2001.
• Improve over CSS B-Trees.
• Speed up key comparisons by storing partial keys instead of complete keys.
• Ensure node size is still sensitive to the cache line.
• Important insight: a key encountered during a search is most similar to the previously compared key or the next compared key.
Partial Key B-Trees
• Save the first bit position where the keys differ.
• Save ‘l’ bits starting from that position (l is configurable).
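The two saved values can be sketched for fixed-width integer keys. This follows the slide wording literally, taking l bits starting at the differing bit; the exact encoding in the Bohannon et al. paper may differ. The function name `partial_key` and the 16-bit default key width are assumptions made for the example.

```python
def partial_key(base: int, key: int, width: int = 16, l: int = 4):
    """Return (offset, bits) describing `key` relative to `base`.

    offset: position of the first differing bit, counted from the most
            significant bit (0 = leftmost); equals `width` if the keys match.
    bits:   up to l bits of `key`, starting at that position.
    """
    diff = base ^ key
    if diff == 0:
        return width, 0                    # identical keys: nothing differs
    offset = width - diff.bit_length()     # leading bits shared by both keys
    remaining = width - offset             # bits from the differing bit onward
    take = min(l, remaining)               # fewer than l bits may be left
    bits = (key >> (remaining - take)) & ((1 << take) - 1)
    return offset, bits


# base = 1010 0000 0000 0000, key = 1010 0110 0000 0000
# The keys first differ at bit 5; the four key bits from there are 1100.
pk = partial_key(0xA000, 0xA600)
```

Storing only the offset and a few bits keeps every index entry the same small size, which is what lets the node still fit a cache line.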
Partial Key B-Trees
• Key comparison technique (figure)
Source: P. Bohannon, P. McIlroy, R. Rastogi. Main Memory Index Structures with Fixed Size Partial Keys. ACM SIGMOD (2001)
Partial Key Trees Performance (figure)
Source: P. Bohannon, P. McIlroy, R. Rastogi. Main Memory Index Structures with Fixed Size Partial Keys. ACM SIGMOD (2001)
Questions?