Lec. 17

carmean
Presentation Transcript


  1. Lec. 17 Introduction to Multilevel Indexing and B-Trees

  2. Lecture Overview
     • Introduction to multilevel indexing and B-trees
     • Insertions in B-trees

  3. Motivations
     Problems with simple indexes that are kept on disk:
     • Searching the index is still slow, even with binary search.
     • We don't want more than 3 or 4 seeks per search, so the log2(N + 1) seeks of a binary search are still too many.
     • Insertions and deletions should be as fast as searches.
     • In a simple index, an insertion or deletion takes O(N) disk accesses, because the index must be kept sorted and entries have to be shifted.
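To make the log2(N + 1) figure concrete, here is a small sketch that computes the worst-case number of probes for a binary search over N sorted index entries. It assumes that every probe may cost one disk seek; the function name `worst_case_probes` is hypothetical and only used for illustration.

```python
import math


def worst_case_probes(n_keys: int) -> int:
    """Worst-case probes for binary search over n_keys sorted entries."""
    return math.ceil(math.log2(n_keys + 1))


# Compare against the "3 or 4 seeks" budget from the slide.
for n in (1_000, 1_000_000):
    print(n, "keys ->", worst_case_probes(n), "probes in the worst case")
```

Even a modest index of a thousand keys already needs far more than 3 or 4 probes, which is the motivation for the structures that follow.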

  4. Solution 1: A sorted list can be represented as a binary search tree (BST).

  5. Binary Search Tree
     • We no longer have to sort the index.
     • Worst-case search in a balanced binary tree takes log2(N + 1) comparisons.
     • Worst-case search in an unbalanced binary tree approaches O(N) comparisons.
     • Verdict: still too many seeks.
     [Figures: balanced BST; unbalanced BST]
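The gap between the balanced and unbalanced cases can be seen with a minimal, non-balancing BST. This is only a sketch: the names `Node`, `insert`, `height`, and `build` are illustrative and do not come from the lecture. Inserting keys in sorted order produces a chain, while a carefully chosen insertion order produces a balanced tree.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(root, key):
    """Plain BST insert with no re-balancing (iterative)."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right


def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


sorted_keys = list(range(1, 16))  # worst case: already sorted
balanced_order = [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]

print("sorted insert order   ->", height(build(sorted_keys)))     # 15 = N
print("balanced insert order ->", height(build(balanced_order)))  # 4 = log2(N + 1)
```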

  6. Solution 2

  7. AVL Trees (named after their inventors, Georgy Adelson-Velsky and Evgenii Landis)
     • An AVL tree is a self-balancing binary search tree.
     • In an AVL tree, the heights of the two child subtrees of any node differ by at most one.
     • Verdict: still too many seeks.
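The AVL height condition can be checked recursively. The sketch below assumes a tree encoded as nested `(left, key, right)` tuples with `None` for an empty subtree; that encoding and the name `avl_check` are illustrative choices, not part of the lecture. It only verifies the balance property and does not perform rotations.

```python
def avl_check(tree):
    """Return (is_balanced, height) for a (left, key, right) tuple tree."""
    if tree is None:
        return True, 0
    left_ok, left_h = avl_check(tree[0])
    right_ok, right_h = avl_check(tree[2])
    # AVL condition: subtree heights differ by at most one at every node.
    ok = left_ok and right_ok and abs(left_h - right_h) <= 1
    return ok, 1 + max(left_h, right_h)


balanced = ((None, 1, None), 2, (None, 3, None))
chain = (None, 1, (None, 2, (None, 3, None)))

print(avl_check(balanced))  # (True, 2)
print(avl_check(chain))     # (False, 3)
```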

  8. Solution 3

  9. Multilevel Indexing
     • Build an index of an index file.
     • How:
       • Build a simple index for the file, sorting the keys with the external sorting method studied previously.
       • Build an index for this index.
       • Build another index for the previous index, and so on.
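The "index of an index" construction can be sketched as a loop that keeps indexing the previous level until a single record remains. This is an illustration under stated assumptions: the keys are already sorted, each index record holds a fixed number of keys, and each higher level stores the largest key of every record below it. The function name `build_multilevel_index` and the `keys_per_record` parameter are hypothetical.

```python
def build_multilevel_index(sorted_keys, keys_per_record):
    """Return index levels, lowest level first.

    Each level is a list of records; each record is a list of keys.
    Assumption: a higher level stores the largest key of every record below.
    """
    levels = []
    current = list(sorted_keys)
    while True:
        records = [current[i:i + keys_per_record]
                   for i in range(0, len(current), keys_per_record)]
        levels.append(records)
        if len(records) <= 1:  # a single record: this is the top level
            return levels
        current = [record[-1] for record in records]


levels = build_multilevel_index(range(1, 1001), keys_per_record=10)
print([len(level) for level in levels])  # [100, 10, 1]
```

With 1000 keys and 10 keys per record, a search reads one record per level, so 3 record reads instead of the roughly 10 probes of a binary search.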

  10. Multilevel Indexing

  11. Multilevel Indexing

  12. Multilevel Indexing
      While multi-record, multilevel indexes greatly reduce the number of disk accesses and their space overhead is minimal, inserting a new key or deleting an old one is very costly: the index still suffers from shifting on insertion and deletion.
      [Figure labels: "Each record has 100 keys <PK, offset>"; "Has 8 keys <PK, offset>"]

  13. Solution 4

  14. B-Trees: An Overview
      • B-trees are self-balancing, so they need no separate re-balancing step.
      • B-trees are multilevel indexes that solve the problem of the linear cost of insertion and deletion.
      • B-trees are now the standard way to represent indexes.

  15. Formal Definition of B-Tree Properties (very important)
      In a B-tree of order m:
      • Every page has a maximum of m descendants.
      • Every page, except for the root and the leaves, has at least ⌈m/2⌉ descendants.
      • The root has at least two descendants (unless it is a leaf).
      • All the leaves appear on the same level.
      • The leaf level forms a complete, ordered index of the associated data file.
      • A B-tree is a self-balancing tree.
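The structural properties above can be expressed as a small checker. This is a sketch, not a full validator: it checks only the descendant counts and the leaf levels (not key ordering), rounds the minimum up to ⌈m/2⌉, and uses a hypothetical `Page` class in which a page with no children is a leaf.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Page:
    keys: list
    children: list = field(default_factory=list)  # empty list means leaf


def check_btree(root, m):
    """Return True if the structural B-tree properties hold for order m."""
    leaf_depths = set()

    def visit(page, depth, is_root):
        n = len(page.children)
        if n == 0:                      # leaf: only record its level
            leaf_depths.add(depth)
            return True
        if n > m:                       # at most m descendants
            return False
        if is_root and n < 2:           # non-leaf root needs at least two
            return False
        if not is_root and n < math.ceil(m / 2):
            return False
        return all(visit(child, depth + 1, False) for child in page.children)

    # All leaves must appear on the same level.
    return visit(root, 0, True) and len(leaf_depths) == 1


good = Page([10], [Page([1, 2]), Page([11, 12])])
uneven = Page([10], [Page([5], [Page([1]), Page([6]), Page([7])]), Page([11])])

print(check_btree(good, 5))    # True
print(check_btree(uneven, 5))  # False: leaves on different levels
```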

  16. Example of a B-tree of order 5

  17. The minimum number of keys in a node is M/2

  18. Insertion in B-Tree (very important)
      • To insert a new element, search for the leaf node where the new element should be added.
      • If the node contains fewer elements than its maximum capacity, insert the element into it in order.
      • If the node is full, split it into two nodes as follows:
        • A single median is chosen from among the leaf's elements and the new element.
        • Values less than the median are put in the new left node and values greater than the median are put in the new right node, with the median acting as a separation value.
        • The separation value is inserted into the node's parent, which may cause the parent to be split, and so on. If the node has no parent (i.e., the node was the root), create a new root above this node (increasing the height of the tree).
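The insertion steps above can be sketched in a few dozen lines. This is a minimal illustration, not a production implementation. It assumes distinct keys, an order of at least 3, and that a node of order m holds at most m - 1 keys (one common convention; the lecture does not fix it). The names `Node`, `BTree`, and `inorder` are hypothetical.

```python
import bisect


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list means leaf


class BTree:
    def __init__(self, order):
        self.order = order  # assumption: max children; max keys = order - 1
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:  # the root was split: create a new root above it
            median, right = split
            self.root = Node([median], [self.root, right])

    def _insert(self, node, key):
        i = bisect.bisect_left(node.keys, key)
        if not node.children:            # leaf: insert in order
            node.keys.insert(i, key)
        else:                            # internal: descend, absorb any split
            split = self._insert(node.children[i], key)
            if split:
                median, right = split
                node.keys.insert(i, median)
                node.children.insert(i + 1, right)
        if len(node.keys) <= self.order - 1:
            return None                  # no overflow
        # Overflow: split around the median and hand it to the parent.
        mid = len(node.keys) // 2
        median = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return median, right


def inorder(node):
    """All keys in sorted order, to confirm nothing was lost."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


tree = BTree(order=5)
for k in range(1, 11):
    tree.insert(k)
print(tree.root.keys)                       # [3, 6]
print([c.keys for c in tree.root.children])  # [[1, 2], [4, 5], [7, 8, 9, 10]]
```

Inserting the new key first and splitting only when the node overflows mirrors the slide's wording, where the median is chosen from the node's elements together with the new element.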

  19. Deletion in B-Tree (very important)
      Deletion from a node:
      • Search for the value to delete.
      • If the value is in a leaf node, simply delete it from the node.
      • If underflow happens, re-balance the tree as described under "Re-balancing after deletion".
      Re-balancing after deletion:
      • Call the node that contained the deleted element the "target node".
      • If the left sibling of the target node has more than the minimum number of elements, move the largest element from the left sibling to the target node. Update the parent if needed. The tree is now balanced.
      • Else, if the right sibling of the target node has more than the minimum number of elements, do the same with the right sibling, moving its smallest element to the target node. The tree is now balanced.
      • Otherwise, both siblings contain exactly the minimum number of elements:
        • If there is a left sibling, move all remaining elements from the target node to the left sibling. The target node is now empty and the left sibling is full.
        • Else, do the same with the right sibling.
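The re-balancing rules can be sketched on a single group of sibling leaves. This is a deliberate simplification: the parent node is not modelled, so the "update the parent" step and any underflow that propagates upward are left out. The function name `delete_from_leaf` and the `min_keys` parameter are hypothetical.

```python
def delete_from_leaf(leaves, idx, key, min_keys):
    """Delete key from leaves[idx] and re-balance among its siblings.

    leaves: list of sorted key lists that share one parent (parent not modelled).
    Returns a short description of the action taken.
    """
    target = leaves[idx]
    target.remove(key)
    if len(target) >= min_keys:
        return "no underflow"

    left = leaves[idx - 1] if idx > 0 else None
    right = leaves[idx + 1] if idx + 1 < len(leaves) else None

    if left is not None and len(left) > min_keys:
        target.insert(0, left.pop())      # largest element of the left sibling
        return "borrowed from left"
    if right is not None and len(right) > min_keys:
        target.append(right.pop(0))       # smallest element of the right sibling
        return "borrowed from right"
    if left is not None:
        left.extend(target)               # merge target into the left sibling
        del leaves[idx]
        return "merged into left"
    if right is not None:
        right[:0] = target                # merge target into the right sibling
        del leaves[idx]
        return "merged into right"
    return "underflow in the only leaf"


leaves = [[1, 2, 3], [4, 5], [6, 7]]
print(delete_from_leaf(leaves, 1, 4, min_keys=2), leaves)
print(delete_from_leaf(leaves, 1, 5, min_keys=2), leaves)
```

In a complete B-tree, a merge also removes an entry from the parent, which may itself underflow and need the same treatment one level up.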

  20. Notes (very important)
      • If the order of a B-tree is m, then:
        • Min number of nodes in root = 2
        • Min number of nodes in any other node =
      • In case of overflow, you should propagate the median to the parent node; in this case, Median =
