B+ tree & B tree Extracted from Garcia Molina adapted by Leu to follow Elmasri’s Definition
B+Tree Example, n=4. [Figure: root (35); internal nodes (11) and (110 130 179); leaves (3 5 11), (30 35), (100 101 110), (120 130), (150 156 179), (180 200).]
Sample non-leaf node: keys 57 81 95, with four tree pointers: to keys k ≤ 57; to keys 57 < k ≤ 81; to keys 81 < k ≤ 95; to keys k > 95
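The routing rule of the sample non-leaf node can be sketched in a few lines of Python. This is only an illustration, assuming the convention in which the leftmost subtree holds keys k ≤ 57; the function name `child_index` is hypothetical and not from the slides.

```python
from bisect import bisect_left

def child_index(keys, k):
    """Return the index of the tree pointer to follow for search key k.

    Assumption: subtree i holds the keys with keys[i-1] < k <= keys[i].
    """
    # bisect_left counts the node keys strictly smaller than k,
    # which is exactly the pointer index under that convention.
    return bisect_left(keys, k)

node_keys = [57, 81, 95]
for k in (57, 58, 81, 95, 96):
    print(k, "-> pointer", child_index(node_keys, k))
```

With the other common convention (the left subtree holds keys strictly smaller than the separator), `bisect_right` would be used instead.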
Sample leaf node (reached from a non-leaf node): keys 57 81 95, with data pointers to the record with key 57, to the record with key 81, and to the record with key 95; the last pointer leads to the next leaf in sequence
In textbook’s notation, n=3. [Figure: a leaf node and a non-leaf node, each holding keys 30 and 35, drawn in the textbook’s notation.]
Size of nodes: p pointers, p − 1 keys (fixed). Please note that here "way" or "order" refers to the maximum number of subtrees. Some definitions instead define "way" as the maximum number of keys
Don’t want nodes to be too empty • Use at least: Non-leaf: ⌈p/2⌉ − 1 keys (so ⌈p/2⌉ tree pointers); Leaf: ⌊p/2⌋ keys & data pointers
p=4. Non-leaf: full node (120 150 180), min. node (30). Leaf: full node (3 5 11), min. node (30 35). A pointer counts even if it is null
B+tree rules tree of order n (1) All leaves at same lowest level (balanced tree) (2) Pointers in leaves point to records except for “sequence pointer”
(3) Number of pointers/keys for B+tree (traditional definition)
                      Max ptrs   Max keys   Min ptrs   Min keys
Non-leaf (non-root)   n          n−1        ⌈n/2⌉      ⌈n/2⌉−1
Leaf (non-root)       n−1        n−1        ⌊n/2⌋      ⌊n/2⌋
Root                  n          n−1        1          1
(3)' Number of pointers/keys for B+tree (Elmasri’s new definition)
                      Max ptrs   Max keys   Min ptrs→data   Min keys
Non-leaf (non-root)   p          p−1        ⌈p/2⌉           ⌈p/2⌉−1
Leaf (non-root)       pleaf      pleaf      ⌈pleaf/2⌉       ⌈pleaf/2⌉
Root                  p          p−1        1               1
pleaf: order of the leaf node; p: order of the internal node
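As a quick way to read the occupancy rules for non-root nodes, the limits can be computed for any p and pleaf. The sketch below assumes the minimums are rounded up (ceiling), as in Elmasri’s definition; the function name `bplus_bounds` and the dictionary layout are hypothetical.

```python
def bplus_bounds(p, pleaf):
    """Occupancy limits for non-root B+-tree nodes (ceiling-based minimums assumed)."""
    half_p = (p + 1) // 2          # ceil(p / 2) in integer arithmetic
    half_leaf = (pleaf + 1) // 2   # ceil(pleaf / 2)
    return {
        "internal": {"max_ptrs": p, "max_keys": p - 1,
                     "min_ptrs": half_p, "min_keys": half_p - 1},
        "leaf": {"max_keys": pleaf, "min_keys": half_leaf},
    }

print(bplus_bounds(4, 3))
```

For p=4 and pleaf=3 this gives internal nodes with 2 to 4 tree pointers (1 to 3 keys) and leaves with 2 to 3 keys.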
Insert into B+tree (a) simple case • space available in leaf (b) leaf overflow (c) non-leaf overflow (d) new root (e) Consider only maximum number of keys
When a node is too full
• Node too full (for m way): K_1, K_2, …, K_⌈m/2⌉−1, K_⌈m/2⌉, K_⌈m/2⌉+1, …, K_m
• Split into two nodes:
  K_1, K_2, …, K_⌈m/2⌉−1, K_⌈m/2⌉: replaces the original node
  K_⌈m/2⌉+1, …, K_m: becomes the right child of the new key
  K_⌈m/2⌉: replicated into the parent node
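(a) Insert key = 7, p=4. [Figure: inserting 7 into the full leaf (3 5 11), which sits next to the leaf (30 31); the leaf splits and key 5 is copied up into the parent.]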
B+ tree with pleaf • Splitting point is important • For a leaf node, the splitting point is j = ⌈(pleaf + 1)/2⌉ • For a non-leaf node, the splitting point is ⌈p/2⌉ • Refer to pages 178-180 of Elmasri’s book
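The two splitting points can be illustrated with a short Python sketch. It assumes the split positions are rounded up (ceiling) and that a leaf split copies the separator into the parent while a non-leaf split moves it up; `split_leaf` and `split_internal` are hypothetical helper names.

```python
def split_leaf(keys, pleaf):
    """Split an overfull leaf that holds pleaf + 1 sorted keys."""
    assert len(keys) == pleaf + 1
    j = (pleaf + 2) // 2                 # ceil((pleaf + 1) / 2)
    left, right = keys[:j], keys[j:]
    return left, right, left[-1]         # the separator is a copy of the last left key

def split_internal(keys, children, p):
    """Split an overfull non-leaf node that holds p keys and p + 1 pointers."""
    assert len(keys) == p and len(children) == p + 1
    j = (p + 1) // 2                     # ceil(p / 2)
    up = keys[j - 1]                     # this key moves up and stays in neither half
    return (keys[:j - 1], children[:j]), up, (keys[j:], children[j:])

print(split_leaf([3, 5, 7, 11], pleaf=3))
print(split_internal([10, 20, 30, 40], list("abcde"), p=4))
```

In the first call the leaf (3 5 7 11) splits into (3 5) and (7 11), with 5 copied into the parent.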
Deletion from B+tree (a) Simple case - no example (b) Coalesce with neighbor (sibling) (c) Re-distribute keys (d) Cases (b) or (c) at non-leaf
(b) Coalesce with sibling • Delete 50, n=4. [Figure: leaves over the keys 10, 30, 40, 50; after deleting 50 the underfull leaf is merged with its sibling.]
When to coalesce • When the sibling has just enough keys: if the sibling has ⌈pleaf/2⌉ keys, then the combined node has ⌈pleaf/2⌉ + ⌈pleaf/2⌉ − 1 keys, and 2⌈pleaf/2⌉ − 1 ≤ (pleaf + 1) − 1 = pleaf, which is not too big!
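The choice between the deletion cases at a leaf can be written as a small decision function. This is a sketch under the assumption that the minimum leaf occupancy is ⌈pleaf/2⌉; the names `leaf_delete_action` and `merged_size` are hypothetical.

```python
def merged_size(pleaf):
    """Keys in the merged leaf when the sibling holds exactly ceil(pleaf/2) keys."""
    min_keys = (pleaf + 1) // 2          # ceil(pleaf / 2)
    return min_keys + (min_keys - 1)     # sibling + underfull leaf

def leaf_delete_action(leaf_size_after, sibling_size, pleaf):
    """Pick the deletion case for a leaf that now holds leaf_size_after keys."""
    min_keys = (pleaf + 1) // 2
    if leaf_size_after >= min_keys:
        return "simple"                  # the leaf is still at least half full
    if sibling_size > min_keys:
        return "redistribute"            # the sibling can spare a key
    # The sibling has just enough keys, so the merged node cannot overflow.
    assert leaf_size_after + sibling_size <= pleaf
    return "coalesce"

for pleaf in range(1, 10):
    print(pleaf, merged_size(pleaf), merged_size(pleaf) <= pleaf)
```

The loop prints the merged size for a few values of pleaf, so the bound 2⌈pleaf/2⌉ − 1 ≤ pleaf can be checked by eye.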
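(c) Redistribute keys • Delete 50, n=4. [Figure: leaves over the keys 10, 20, 30, 35 and 40, 50; after deleting 50, key 35 is moved over from the sibling and the separator in the parent is updated.]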
(d) Non-leaf coalesce • Delete 37, n=4. [Figure: leaves (1 3), (10 14), (20 22), (25 26), (30 37), (40 45); deleting 37 makes the coalescing propagate to the non-leaf level and produces a new root.]
B+tree deletions in practice • Often, coalescing is not implemented • Too hard and not worth it!
Example: A PARTS file with Part# as key field includes records with the following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. Suppose the search field values are inserted in the given order in a B+-tree of order p=4 and pleaf=3; show how the tree will expand and what the final tree looks like.
Solution: A B+-tree of order p=4 implies that each internal node in the tree (except possibly the root) should have at least 2 pointers (1 key) and at most 4 pointers. For pleaf=3, leaf nodes must have at least 2 keys and at most 3 keys. The figure on page 50 shows how the tree progresses as the keys are inserted. We will only show a new tree when an insertion causes a split of one of the leaf nodes, and then show how the split propagates up the tree. Hence, step 1 below shows the tree after insertion of the first 3 keys 23, 65, and 37, and before inserting 60, which causes overflow and splitting. The trees given below show how the keys are inserted in order. Below, we give the keys inserted for each tree: 1: 23, 65, 37; 2: 60; 3: 46; 4: 92; 5: 48, 71; 6: 56; 7: 59, 18; 8: 21; 9: 10; 10: 74; 11: 78; 12: 15; 13: 16; 14: 20; 15: 24; 16: 28, 39; 17: 43, 47; 18: 50, 69; 19: 75; 20: 8, 49, 33, 38.
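To follow the insertion steps without the figure, the exercise can be replayed with a small simulation. The sketch below is not the textbook’s algorithm verbatim: it assumes distinct keys, routes a key k to the subtree whose separator is the first one ≥ k, splits a leaf at ⌈(pleaf+1)/2⌉ (copying the separator up) and a non-leaf node at ⌈p/2⌉ (moving the separator up), and omits record pointers and leaf sequence pointers. All names (`Node`, `build`, `leaf_keys`) are hypothetical.

```python
from bisect import bisect_left, insort

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children            # None marks a leaf

def _insert(node, key, p, pleaf):
    """Insert into a subtree; return (separator, new right node) if it split."""
    if node.children is None:               # leaf: store the key here
        insort(node.keys, key)
        if len(node.keys) <= pleaf:
            return None
        j = (pleaf + 2) // 2                # ceil((pleaf + 1) / 2)
        right = Node(node.keys[j:])
        node.keys = node.keys[:j]
        return node.keys[-1], right         # separator is copied up
    i = bisect_left(node.keys, key)         # subtree i holds keys <= node.keys[i]
    split = _insert(node.children[i], key, p, pleaf)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.children) <= p:
        return None
    j = (p + 1) // 2                        # ceil(p / 2)
    up = node.keys[j - 1]                   # separator is moved up
    new_right = Node(node.keys[j:], node.children[j:])
    node.keys, node.children = node.keys[:j - 1], node.children[:j]
    return up, new_right

def build(keys, p, pleaf):
    """Insert the keys in order, growing a new root whenever the old one splits."""
    root = Node([])
    for k in keys:
        split = _insert(root, k, p, pleaf)
        if split is not None:
            sep, right = split
            root = Node([sep], [root, right])
    return root

def leaf_keys(node):
    """Key lists of all leaves, left to right."""
    if node.children is None:
        return [node.keys]
    return [leaf for child in node.children for leaf in leaf_keys(child)]

parts = [23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78,
         15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38]
tree = build(parts, p=4, pleaf=3)
print("root:", tree.keys)
print("leaves:", leaf_keys(tree))
```

Calling `build(parts[:n], 4, 3)` for increasing n shows the intermediate trees; for example, after the first four keys the root holds 37 over the leaves (23 37) and (60 65).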
Deletion: Suppose the following search field values are deleted in the given order from the B+-tree of Exercise 5.11; show how the tree will shrink and show the final tree. The deleted values are: 65, 75, 43, 18, 20, 92, 59, 37.
Solution: An important note about a deletion algorithm for a B+-tree is that deleting a key value from a leaf node will result in a reorganization of the tree if: (i) the leaf node is less than half full; in this case, we will combine it with the next leaf node (other algorithms combine it with either the next or the previous leaf node, or both); (ii) the key value deleted is the rightmost (last) value in the leaf node, in which case its value will appear in an internal node; in this case, the key value to the left of the deleted key in the leaf node replaces the deleted key value in the internal node.
Variation on B+tree: B-tree (no +) • Idea: • Avoid duplicate keys • Have record pointers in non-leaf nodes
B-tree non-leaf node: K1 P1 K2 P2 K3 P3, where P1, P2, P3 point to the records with keys K1, K2, K3. The four tree pointers lead to keys x < K1, K1 < x < K2, K2 < x < K3, and x > K3
B-tree example, p=3 (max. subtrees). [Figure: root (65 125); internal nodes (25 45), (85 105), (145 165); leaves (10 20), (30 40), (50 60), (70 80), (90 100), (110 120), (130 140), (150 160), (170 180).] • Sequence pointers are not useful now! • (but keep space for simplicity)
Note on inserts • Say we insert a record with key = 25 into the leaf (10 20 30), p=4 • Afterwards: the leaf splits into (10) and (25 30), and key 20 moves up into the parent
So, for B-trees: • Each node has at most p tree pointers (so at most p − 1 keys) • Each node, except the root, has at least ⌈p/2⌉ tree pointers (so at least ⌈p/2⌉ − 1 keys) • The root node has at least two tree pointers, unless it is the only node in the tree • All leaf nodes are at the same level. Leaf nodes have the same structure as internal nodes, except that all of their tree pointers Pi are null
Insertion Criterion • Insert at the failure node, found by searching the tree • Insert at the right place; if the node becomes too full, that is, has p keys in it, then split • To split, take the key at position ⌈p/2⌉ as the splitting point: take K_⌈p/2⌉ out and insert it into its parent • Splitting may propagate to the root
example • Build a B-tree of order p =3. The values are inserted in the order 8, 5, 1, 7, 3, 12, 9, 6
More example • Try p = 5 with the following key sequence 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39 • Note: large p implies easy solution
Solution (may be wrong!) Root: (28). Internal nodes: (16, 21) and (46, 65). Leaves: (10, 15), (18, 20), (23, 24), (37, 39), (48, 56, 59, 60), (71, 74, 78, 92)
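Both B-tree exercises can be checked with a short simulation of the insertion criterion. The sketch below assumes distinct keys, treats a node as too full once it holds p keys, and moves the key at position ⌈p/2⌉ up into the parent; record pointers are left out, and the names `BNode`, `build`, and `levels` are hypothetical.

```python
from bisect import bisect_left

class BNode:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []      # an empty list marks a leaf

def _insert(node, key, p):
    """Insert into a subtree; return (key moved up, new right node) if it split."""
    i = bisect_left(node.keys, key)
    if not node.children:                   # leaf: this is the failure node
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key, p)
        if split is None:
            return None
        up, right = split
        node.keys.insert(i, up)
        node.children.insert(i + 1, right)
    if len(node.keys) < p:                  # at most p - 1 keys: no overflow
        return None
    j = (p + 1) // 2                        # ceil(p / 2), 1-based split position
    up = node.keys[j - 1]
    right = BNode(node.keys[j:], node.children[j:])
    node.keys, node.children = node.keys[:j - 1], node.children[:j]
    return up, right

def build(keys, p):
    """Insert the keys in order, creating a new root when the old one splits."""
    root = BNode([])
    for k in keys:
        split = _insert(root, k, p)
        if split is not None:
            up, right = split
            root = BNode([up], [root, right])
    return root

def levels(root):
    """Key lists of the nodes, level by level from the root down."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out

print(levels(build([8, 5, 1, 7, 3, 12, 9, 6], p=3)))
# [[[5, 8]], [[1, 3], [6, 7], [9, 12]]]

seq = [23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18,
       21, 10, 74, 78, 15, 16, 20, 24, 28, 39]
for level in levels(build(seq, p=5)):
    print(level)
```

Under these assumptions, the order-5 run ends with root (28), internal nodes (16, 21) and (46, 65), and the six leaves listed in the solution above, so the solution is consistent with this splitting rule.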