B-Trees

B-Trees. © Dave Bockus Acknowledgements to: Dr Frederic Maire Brisbane, Queensland, AUSTRALIA for some of the material found in this presentation . Motivation. When data is too large to fit in main memory, then the number of disk accesses becomes important.


Presentation Transcript


  1. B-Trees © Dave Bockus Acknowledgements to: Dr Frederic Maire Brisbane, Queensland, AUSTRALIA for some of the material found in this presentation

  2. Motivation • When data is too large to fit in main memory, the number of disk accesses becomes important. • A disk access is extremely expensive compared to a typical computer instruction, because of the disk's mechanical limitations. • One disk access takes about as long as 200,000 instructions. • The number of disk accesses will therefore dominate the running time.

  3. Motivation Cont. • Secondary memory (disk) is divided into equal-sized blocks (typical sizes are 512, 2048, 4096, or 8192 bytes). • The basic I/O operation transfers the contents of one disk block to or from main memory. • Our goal is to devise a multiway search tree that minimizes file accesses by making full use of each disk block read.
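
To make the link between block size and node size concrete, here is a minimal sketch that estimates how many children a node can have when one node is stored per disk block. The 8-byte key size, the 8-byte child-pointer size, and the function name `max_order` are illustrative assumptions, not values from the presentation; a real layout would also reserve space for a node header.

```python
def max_order(block_size: int, key_size: int = 8, pointer_size: int = 8) -> int:
    """Largest m such that m child pointers plus m - 1 keys fit in one block.

    key_size and pointer_size are assumed values, chosen only for illustration.
    """
    # Solve m * pointer_size + (m - 1) * key_size <= block_size for m.
    return (block_size + key_size) // (key_size + pointer_size)


# The block sizes listed on the slide.
for block_size in (512, 2048, 4096, 8192):
    print(block_size, max_order(block_size))
```

Under these assumptions a 4096-byte block holds a node with a few hundred children, which is why a single block read can narrow the search so sharply.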

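  4. m-ary Trees • A node contains multiple keys. • The order of the subtrees is determined by the keys in the parent node. • If each node has m children and there are n keys, then the average time taken to search the tree is about log_m(n). [Figure: a node with keys K1, K2, K3, K4, etc. and subtrees T1, T2, T3, ...; T1 holds the keys K < K1, T2 holds the keys K1 < K < K2, and so on.]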
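
As a rough illustration of the log_m(n) claim, the sketch below counts how many levels a completely full m-ary search tree needs to hold n keys. The function name `levels_needed` and the choice of one million keys are assumptions made for the example; the calculation uses integer arithmetic to avoid floating-point rounding in the logarithm.

```python
def levels_needed(n_keys: int, m: int) -> int:
    """Smallest number of levels h such that a full m-ary search tree
    with h levels holds at least n_keys keys.

    A full tree with h levels has (m**h - 1) / (m - 1) nodes with m - 1 keys
    each, so it holds m**h - 1 keys.
    """
    h = 0
    capacity = 0
    while capacity < n_keys:
        h += 1
        capacity = m ** h - 1
    return h


# Compare a binary tree with wider nodes for one million keys.
for m in (2, 16, 256):
    print(m, levels_needed(1_000_000, m))
```

Each level corresponds to one node visit, so a wider node means fewer block reads per search.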

  5. Searching m-ary Trees • A generalized symmetric-order traversal (SOT), the m-ary analogue of an in-order traversal, visits all keys in ascending order. for (i = 1; i <= m-1; i++) { visit subtree to left of k_i; visit k_i; } visit subtree to right of k_(m-1);
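
The traversal loop above can be sketched in runnable form as follows. The `Node` class (a list of keys plus a list of children, empty for a leaf) and the function name `in_order` are assumptions introduced for this example; the sample tree is the 2-3 tree shown later in the presentation.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    keys: List[str]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def in_order(node: Node) -> Iterator[str]:
    """Yield every key in ascending order."""
    for i, key in enumerate(node.keys):
        if node.children:
            yield from in_order(node.children[i])   # subtree to the left of key i
        yield key
    if node.children:
        yield from in_order(node.children[-1])      # subtree to the right of the last key


tree = Node(["G"], [
    Node(["C"], [Node(["A"]), Node(["D", "E"])]),
    Node(["I", "M"], [Node(["H"]), Node(["J", "K"]), Node(["N", "O"])]),
])
print(" ".join(in_order(tree)))
```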

  6. B-Trees & Efficiency • Used by the Mac, NTFS, and OS/2 file systems for their file structures. • Allow insertion into and deletion from a tree structure while keeping the log_m(n) property, where m is the order of the tree. • The idea is to leave some key slots open in each node, so that in most cases a new key is inserted into available space. • Less dynamic than our typical binary tree. • Ideal for disk-based operations.

  7. Definition of a B-Tree • Def: A B-tree of order m is a tree with the following properties: • The root has at least 2 children, unless it is a leaf. • No node in the tree has more than m children. • Every node except for the root and the leaves has at least ⌈m/2⌉ children. • All leaves appear at the same level. • An internal node with k children contains exactly k-1 keys.
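
The properties above can be checked mechanically. The following is a minimal sketch, not a full validator: the `Node` class and the name `is_btree` are assumptions for the example, the minimum child count is taken to be m/2 rounded up, leaves are assumed to hold between 1 and m-1 keys (the slide does not state a leaf bound), and the ordering of keys is not checked.

```python
import math
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    keys: List[str]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def is_btree(root: Node, m: int) -> bool:
    """Check the structural B-tree properties for order m (key order is not checked)."""
    min_children = math.ceil(m / 2)
    leaf_depths: Set[int] = set()

    def check(node: Node, depth: int, is_root: bool) -> bool:
        if not node.children:                      # leaf
            leaf_depths.add(depth)
            return 1 <= len(node.keys) <= m - 1    # assumed leaf bound
        k = len(node.children)
        if len(node.keys) != k - 1:                # k children need exactly k-1 keys
            return False
        if k > m:                                  # no node has more than m children
            return False
        if is_root and k < 2:                      # a non-leaf root has at least 2 children
            return False
        if not is_root and k < min_children:       # other internal nodes: at least ceil(m/2)
            return False
        return all(check(child, depth + 1, False) for child in node.children)

    # All leaves must appear at the same level.
    return check(root, 0, True) and len(leaf_depths) == 1
```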

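  8. 2-3 Trees [Figure: a 2-3 tree, i.e. a B-tree of order 3. Root: G. Children of G: C and I | M. Leaves under C: A and D | E. Leaves under I | M: H, J | K, and N | O.]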

  9. Insertion • Insert k_i into a B-tree of order m. • We find the insertion point (in a leaf) by doing a search. • If there is room, then enter k_i. • Else, promote the middle key to the parent and split the node into two nodes around the middle key. • If the splitting backs up to the root, then make a new root containing the middle key. • Note: the tree grows from the leaves, so balance is always maintained.
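
The insertion steps above can be sketched as a short recursive routine. This is an illustration under stated assumptions rather than the presentation's own code: the `Node` class and the names `insert` and `_insert` are hypothetical, keys are assumed to be distinct, and a node is split as soon as it holds m keys.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    keys: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def _insert(node: Node, key: str, m: int) -> Optional[Tuple[str, Node]]:
    """Insert key below node; return (middle key, new right node) if node had to split."""
    i = bisect.bisect_left(node.keys, key)
    if not node.children:
        node.keys.insert(i, key)                   # insertion always happens in a leaf
    else:
        split = _insert(node.children[i], key, m)
        if split is None:
            return None
        mid_key, right = split
        node.keys.insert(i, mid_key)               # promoted key lands in this node
        node.children.insert(i + 1, right)
    if len(node.keys) <= m - 1:
        return None                                # there was room
    mid = len(node.keys) // 2                      # overflow: split around the middle key
    mid_key = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return mid_key, right


def insert(root: Node, key: str, m: int) -> Node:
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key, m)
    if split is None:
        return root
    mid_key, right = split
    return Node([mid_key], [root, right])          # splitting reached the root


# Inserting A, B, C into an empty order-3 tree forces a root split.
root = Node()
for k in "ABC":
    root = insert(root, k, 3)
print(root.keys, [c.keys for c in root.children])
```

Because a new level is only ever created by a root split, every leaf stays at the same depth.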

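  10. Insertion Example • Starting tree: root G; children C and I | M; leaves A and D | E under C; leaves H, J | K, and N | O under I | M. • L is inserted into the above tree: the leaf J | K overflows, so K is promoted and the parent becomes I | K | M with leaves H, J, L, and N | O. • The node I | K | M overflows in turn, so K is promoted again; this gives the new tree: root G | K; children C, I, and M; leaves A and D | E under C, H and J under I, L and N | O under M.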

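  11. Splitting Nodes • Middle key is promoted. • Creating a new root. [Figure: an overfull root A | B | C with subtrees T1, T2, T3, T4 is split into a new root B with left child A (subtrees T1, T2) and right child C (subtrees T3, T4).]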

  12. Deletion • If the entry to be deleted is not in a leaf, swap it with its successor (or predecessor) under the natural order of the keys. Then delete the entry from the leaf. • If the leaf contains more than the minimum number of entries, then one can be deleted with no further action.

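  13. Deletion Example 1 • Starting tree: root C with leaves A and D | E. • Delete D: the leaf D | E holds more than the minimum number of entries, so D is simply removed, leaving root C with leaves A and E. • Delete C (from the starting tree): the successor D is promoted and C is deleted from the leaf, leaving root D with leaves A and E.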

  14. Deletion Cont. • If the node contains only the minimum number of entries, consider its two immediate siblings under the same parent node: • If one of these siblings has more than the minimum number of entries, then move one entry from this sibling up to the parent node, and one entry from the parent down to the deficient node. • This is a rotation, which balances the nodes. • Note: all nodes must comply with the minimum entry restriction.
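
The rotation described above can be sketched as follows. The `Node` class, the function name `rotate`, and the `min_keys` parameter are assumptions for this illustration; the function tries the left sibling first and then the right one, which is an arbitrary choice. The sample data is the small tree from Deletion Example 2 after A has been removed from its leaf.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def rotate(parent: Node, i: int, min_keys: int) -> bool:
    """Repair deficient child i by borrowing through the parent from an adjacent sibling.

    Returns False if neither sibling has more than min_keys keys.
    """
    child = parent.children[i]
    if i > 0 and len(parent.children[i - 1].keys) > min_keys:
        left = parent.children[i - 1]
        child.keys.insert(0, parent.keys[i - 1])       # parent key moves down
        parent.keys[i - 1] = left.keys.pop()           # sibling's largest key moves up
        if left.children:
            child.children.insert(0, left.children.pop())
        return True
    if i + 1 < len(parent.children) and len(parent.children[i + 1].keys) > min_keys:
        right = parent.children[i + 1]
        child.keys.append(parent.keys[i])              # parent key moves down
        parent.keys[i] = right.keys.pop(0)             # sibling's smallest key moves up
        if right.children:
            child.children.append(right.children.pop(0))
        return True
    return False


# Root C with an emptied leaf (A was deleted) and sibling D | E.
parent = Node(["C"], [Node([]), Node(["D", "E"])])
rotate(parent, 0, min_keys=1)
print(parent.keys, [c.keys for c in parent.children])
```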

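  15. Deletion Example 2 • Starting tree: root C with leaves A and D | E. • Delete A: the leaf that held A is now deficient, and its sibling D | E has more than the minimum number of entries. • Rotation: D moves up into the root and C moves down into the deficient leaf, leaving root D with leaves C and E.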

  16. Deletion Cont. • If both immediate siblings have exactly the minimum number of entries, then merge the deficient node with one of its immediate siblings and one entry from the parent node. • If this leaves the parent node with too few entries, then the process is propagated upward.
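
The merge step above can be sketched as follows. The `Node` class and the function name `merge` are assumptions for this illustration, and merging with the right sibling whenever one exists is an arbitrary choice. The usage lines replay Deletion Example 3: the first call merges at the leaf level after H is deleted, and the second call handles the propagation to the level above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def merge(parent: Node, i: int) -> None:
    """Merge child i with an adjacent sibling and the separating key from the parent."""
    if i == len(parent.children) - 1:
        i -= 1                                   # last child: merge with its left sibling
    left, right = parent.children[i], parent.children[i + 1]
    left.keys.append(parent.keys.pop(i))         # separator key comes down from the parent
    left.keys.extend(right.keys)
    left.children.extend(right.children)
    del parent.children[i + 1]                   # the parent loses one key and one child


# Tree from Deletion Example 3, with H already removed from its leaf.
node_i = Node(["I"], [Node([]), Node(["J"])])
root = Node(["G", "K"], [
    Node(["C"], [Node(["A"]), Node(["D", "E"])]),
    node_i,
    Node(["M"], [Node(["L"]), Node(["N", "O"])]),
])
merge(node_i, 0)   # leaf level: empty leaf + I + J become the leaf I | J
merge(root, 1)     # node_i now has no keys: combine it with K and sibling M
print(root.keys, [c.keys for c in root.children])
```

A complete deletion routine would first attempt a rotation and fall back to merging only when both siblings hold the minimum.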

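  17. Deletion Example 3 • Starting tree: root G | K; children C, I, and M; leaves A and D | E under C, H and J under I, L and N | O under M. • Delete H: the leaf that held H is now deficient, and its only sibling J holds just the minimum. • The deficient leaf is combined with one key from its parent (I) and its sibling (J), giving the leaf I | J. • Resulting tree: root G | K; children C, a node left with no keys (formerly I), and M; leaves A and D | E under C, I | J under the empty node, L and N | O under M.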

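  18. Deletion Example 3 Cont. • The internal node that held I is now deficient. • The deficient node is combined with one key from its parent (K) and its sibling (M), giving the node K | M with leaves I | J, L, and N | O. • Resulting tree: root G; children C and K | M; leaves A and D | E under C; leaves I | J, L, and N | O under K | M. • Node G is legal, so propagation up the tree stops.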

  19. Review of Deletions • All deletions take place in leaf nodes. • To delete an internal key, swap it with its successor or predecessor, which is in a leaf. • Then delete. • Deficient nodes are legalized by: • Rotation with a sibling and the parent, OR • Combining with a key from the parent and a sibling. • Propagating up the tree until a legal node is encountered.

  20. End Notes • Studies have shown that on average there are about 1/(⌈m/2⌉ - 1) splits per insertion. • E.g. • For a 2-3 tree (m = 3) there is 1. • For a 10-ary tree there is 1/4.
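
The two figures quoted above can be reproduced with a one-line calculation, provided m/2 is rounded up (with m/2 = 1.5 the 2-3 tree would give 2 rather than 1). The function name `average_splits_per_insert` is an assumption for this sketch; it evaluates the quoted estimate and does not measure real trees.

```python
import math
from fractions import Fraction


def average_splits_per_insert(m: int) -> Fraction:
    """Evaluate the estimate 1 / (ceil(m/2) - 1) for a B-tree of order m >= 3."""
    if m < 3:
        raise ValueError("the estimate is only defined for m >= 3")
    return Fraction(1, math.ceil(m / 2) - 1)


for m in (3, 10):
    print(m, average_splits_per_insert(m))
```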
