B+-Tree Index

B+-Tree Index Chapter 10 Modified by Donghui Zhang Nov 9, 2005

Motivation • Suppose every disk page holds 133 records. • You are given 1334 = 0.3 billion records. They occupy 1333 = 2.3 million disk pages. • You can utilize a small memory buffer of 134 pages. • You can build an index structure. • Given the key of a record, what is the minimum guaranteed number of disk I/Os to find the record?

Content • B+-tree index • Structure • Search • Insert • Delete • Bulk-loading a B+-tree

Index Entries (Direct search) Data Entries ("Sequence set") B+ Tree Structure • Insert/delete at log F N cost; keep tree height-balanced. (F = fanout, N = # leaf pages) • Minimum 50% occupancy (except for root). Each node contains d <= m <= 2d entries. The parameter d is called the order of the tree. • Supports equality and range-searches efficiently.

B+ Tree Equality Search • Search begins at root, and key comparisons direct it to a leaf. • Search for 15*… Root 30 13 17 24 39* 3* 5* 19* 20* 22* 24* 27* 38* 2* 7* 14* 16* 29* 33* 34* • Based on the search for 15*, we know it is not in the tree!

B+ Tree Range Search • Search all records whose ages are in [15,28]. • Equality search 15*. • Follow sibling pointers. Root 30 13 17 24 39* 3* 5* 19* 20* 22* 24* 27* 38* 2* 7* 14* 16* 29* 33* 34*

B+ Trees in Practice • Typical order: 100. Typical fill-factor: 67%. • average fanout = 133 • Can often hold top levels in buffer pool: • Level 1 = 1 page = 8 KB • Level 2 = 133 pages = 1 MB • Level 3 = 17,689 pages = 145 MB • Level 4 = 2,352,637 pages = 19 GB • With 1 MB buffer, can locate one record in 19 GB (or 0.3 billion records) in two I/Os!

Inserting a Data Entry into a B+ Tree • Find correct leaf L. • Put data entry onto L. • If L has enough space, done! • Else, must splitL (into L and a new node L2) • Redistribute entries evenly, copy upmiddle key. • Insert index entry pointing to L2 into parent of L. • This can happen recursively • To split index node, redistribute entries evenly, but push upmiddle key. (Contrast with leaf splits.) • Splits “grow” tree; root split increases height. • Tree growth: gets wider or one level taller at top.

Inserting 8* into Example B+ Tree • Find leaf, in the same way as the Search algorithm. • Handle overflow by splitting. Root 30 13 17 24 39* 3* 5* 19* 20* 22* 24* 27* 38* 2* 7* 14* 16* 29* 33* 34*

Entry to be inserted in parent node. (Note that 17 is pushed up and only 17 this with a leaf split.) 5 13 24 30 Inserting 8* into Example B+ Tree Entry to be inserted in parent node. (Note that 5 is s copied up and • Observe how minimum occupancy is guaranteed in both leaf and index pg splits. • Note difference between copy-upand push-up; be sure you understand the reasons for this. 5 continues to appear in the leaf.) 3* 5* 2* 7* 8* appears once in the index. Contrast

Example B+ Tree After Inserting 8* Root 17 24 5 13 30 39* 2* 3* 5* 7* 8* 19* 20* 22* 24* 27* 38* 29* 33* 34* 14* 16* • Notice that root was split, leading to increase in height. • In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice.

Deleting a Data Entry from a B+ Tree • Start at root, find leaf L where entry belongs. • Remove the entry. • If L is at least half-full, done! • If L has only d-1 entries, • Try to re-distribute, borrowing from sibling (adjacent node with same parent as L). • If re-distribution fails, mergeL and sibling. • If merge occurred, must delete entry (pointing to L or sibling) from parent of L. • Merge could propagate to root, decreasing height.

Deleting 19* and 20* Root • Deleting 19* is easy. • Deleting 20* is done with re-distribution. 17 24 5 13 30 39* 2* 3* 5* 7* 8* 19* 20* 22* 24* 27* 38* 29* 33* 34* 14* 16*

After Deleting 19* and 20* • Notice, in re-distribution, how middle key is copied up. • If delete 24*… Root 17 27 5 13 30 39* 2* 3* 5* 7* 8* 22* 24* 27* 29* 38* 33* 34* 14* 16*

... And Then Deleting 24* • Must merge. • Observe `toss’ of index entry (on right), and `pull down’ of index entry (below). 30 39* 22* 27* 38* 29* 33* 34* Root 5 13 17 30 3* 39* 2* 5* 7* 8* 22* 38* 27* 33* 34* 14* 16* 29*

2* 3* 5* 7* 8* 39* 17* 18* 38* 20* 21* 22* 27* 29* 33* 34* 14* 16* Example of Non-leaf Re-distribution • Tree is shown below during deletion of 24*. • In contrast to previous example, can re-distribute entry from left child of root to right child. Root 22 30 17 20 5 13

After Re-distribution • Intuitively, entries are re-distributed by `pushingthrough’ the splitting entry in the parent node. • It suffices to re-distribute index entry with key 20; we’ve re-distributed 17 as well for illustration. Root 17 22 30 5 13 20 2* 3* 5* 7* 8* 39* 17* 18* 38* 20* 21* 22* 27* 29* 33* 34* 14* 16*

3* 6* 9* 10* 11* 12* 13* 23* 31* 36* 38* 41* 44* 4* 20* 22* 35* Bulk Loading of a B+ Tree • If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow. • Bulk Loadingcan be done much more efficiently. • Initialization: Sort all data entries, insert pointer to first (leaf) page in a new (root) page. Root Sorted pages of data entries; not yet in B+ tree

Bulk Loading (Contd.) Root • Index entries for leaf pages always entered into right-most index page just above leaf level. • Assume pages in the rightmost path to have double page size. • Split when double plus one. Data entry pages 10 12 20 6 not yet in B+ tree 3* 6* 9* 10* 11* 12* 13* 23* 31* 36* 38* 41* 44* 4* 20* 22* 35* Root 12 Data entry pages not yet in B+ tree 6 20 10 23 3* 6* 9* 10* 11* 12* 13* 23* 31* 36* 38* 41* 44* 4* 20* 22* 35*

Summary of Bulk Loading • Option 1: multiple inserts. • Slow. • Does not give sequential storage of leaves. • Option 2:Bulk Loading • Has advantages for concurrency control. • Fewer I/Os during build. • Leaves will be stored sequentially (and linked, of course). • Can control “fill factor” on pages.

B+-Tree Index

B+-Tree Index

Presentation Transcript

b b b b b b

b b b b b b

B B L

B O B B ringin ’ O ysters B ack

B B B B A A A G G G G B

B,B,B, Blogging: Your Voice

B B S

H  b b

1 2 3 4 5 6 7 8 9 0 1 b a a b c b b a b c $ a a b c b b a b c $ a b c b b a b c $ b c b b a b c $

B b

CUu = b (worst case) b/2 (average case). CUn = b CUr = b CUpu = b CUps = b log (b) + b

B-b

B b

sec B cos B sin B tan B

West B&B

B&B Accommodation

b. B ONLY

Sea Ice

Sea Ice

B+-Tree Index

B+-Tree Index

Presentation Transcript

b b b b b b

b b b b b b

B B L

B O B B ringin ’ O ysters B ack

B B B B A A A G G G G B

B,B,B, Blogging: Your Voice

B B S

H  b b

1 2 3 4 5 6 7 8 9 0 1 b a a b c b b a b c $ a a b c b b a b c $ a b c b b a b c $ b c b b a b c $

B b

CUu = b (worst case) b/2 (average case). CUn = b CUr = b CUpu = b CUps = b log (b) + b

B-b

B b

sec B cos B sin B tan B

West B&amp;B

B&amp;B Accommodation

b. B ONLY

Sea Ice

Sea Ice

West B&B

B&B Accommodation