## Lecture 39: Greek Tragedy & Balanced Trees

**CSC 213 – Large Scale Programming**

**Today’s Goals**
• Review why a new search tree algorithm is needed
  • What real-world problems occur with the old tree?
  • Why does garbage collection make the problem worse?
  • What was the ideal approach? How could we force this?
• Consider how to create other search tree types
  • Not limit nodes to 1 element & what could happen?
  • How to perform insertions on multi-nodes?
  • What about withdrawal? How can we remove data?
  • Can this sound dirtier? And do I hear banjos playing?

**Dictionary ADT**
• Dictionary and Map map keys to values
• O(1) time with hash, but only if hash is good
• Can guarantee better: O(log n) with balanced BST
• Assumes data fits in memory since locality will suck
• But, honestly, how big can a tree be?
  • Library of Congress – 20 TB in text database
  • Amazon.com – 42 TB of combined data
  • ChoicePoint – 250 TB of data on everyday Americans
  • World Data Center for Climate – 4 PB of climate data
  (Numbers gathered from Feb. 2007 article)

**Optimal Tree Partition**
(Figure: optimal partition of a tree.)
But no GC algorithm produces this!

**Real-World Big Search Trees**
• Excellent way to test roommate's system

**(a,b) Trees to the Rescue!**
• General solution to frequent hikes to Germany
  • Used by Linux & MacOS to track files & directories
  • MySQL & other databases use this to hold all the data
  • Found in many other places where paging occurs
• Simple rules define working of any (a,b) tree
  • Grows upward so that all leaves are found at the same level
  • At least a children for each internal node
  • Every internal node has at most b children

**What is “the BTree?”**
• Common multi-way tree implementation
• Describe B-Tree using order (“BTree of order m”)
  • m/2 to m children per internal node
  • Root node can have m or fewer elements
• Many variants exist to improve some failing
  • Each variant is specialized for some niche use
  • Only minor differences between variants
  • Will just describe most basic B-Tree during lecture

**BTree Order**
• Select order minimizing paging when created
  • Elements & references to kids in a full node fill a page
  • Nodes have at least m/2 elements, even at their smallest
  • In memory, guarantees each page is at least 50% full
• How many pages are touched by each operation?

**Multi-Way Search Tree**
• Nodes contain multiple elements
• Tree grows up with leaves always at same level
• Each internal node:
  • At least 2 children
  • 1 fewer Entrys than children
  • Entrys sorted from smallest to largest

(Figure: multi-way search tree with root [11 24] and children [2 6 8], [15], [27 30].)

**Multi-Way Search Tree**
• Children $v_1, v_2, \dots, v_d$ & keys $k_1, k_2, \dots, k_{d-1}$
• Keys in subtree $v_1$ smaller than $k_1$
• Keys in subtree $v_i$ between $k_{i-1}$ and $k_i$
• Keys in subtree $v_d$ greater than $k_{d-1}$

**Inorder Traversal**
• Visit each child, $v_i$, before visiting Entry $e_i$
• As with BST, visits keys in increasing order

(Figure: the same tree annotated with the order in which keys are visited.)

**Multi-Way Searching**
• Similar to BST treeSearch for finding a key

    for i = 0 to e.length - 1 do
        if k < e[i].getKey() then return search(child[i])
        if k == e[i].getKey() then return e[i]
    endfor
    if k > e[e.length-1].getKey() then return search(child[child.length-1])

• Example: find(8)

(Figure: search path for find(8) in the tree above. At the root, 8 < 11, so the search descends to the first child [2 6 8], where 8 is found.)

**(2,4) Trees**
• Multi-way search tree with 2 properties:
  • Node-Size Property: internal nodes have at most 4 children
  • Depth Property: all external nodes at same depth
• Nodes are either 2-node, 3-node or 4-node
  • Node’s number of children used as basis for name

(Figure: (2,4) tree with root [10 15 24] and children [2 8], [12], [18], [27 32].)

**Insertion**
• Start by searching for key k
• Entry added to last internal node searched
• Depth property preserved by enforcing this
• Example: insert(30)

(Figure: insert(30) on the tree above. The search ends at node [27 32], which becomes [27 30 32].)

**Insertion**
• Insertion may cause overflow!
  • 5-node created by the insertion
  • This would make it violate the Node-Size property

(Figure: tree with root [15 24] and children [12], [18], [27 32 35]. Inserting 30 turns the last child into the 5-node [27 30 32 35].)

**In Case Of Overflow Split Node**
• Split 5-node into 2 new nodes
  • Entrys $e_1, e_2$ & children $v_1, v_2, v_3$ become a 3-node
  • 2-node created with Entry $e_4$ & children $v_4, v_5$
• Promote $e_3$ to parent node
  • If overflow occurs in root node, create new root
  • Overflow can cascade when parent already was 4-node

(Figure: before the split, root [15 24] with children [12], [18], [27 30 32 35]. After the split, root [15 24 32] with children [12], [18], [27 30], [35].)

**Parent Overflow**
• In case of cascade, repeat overflow process
• Works identically to when children are external
• Example: insert(29)

(Figure: tree with 4-node root [15 24 26] and children [12], [18], [25], [27 32 35]. Inserting 29 produces the 5-node [27 29 32 35] beneath a root that is already full.)
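
The multi-way search and inorder traversal rules from the slides can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `MultiWayNode` and `MultiWaySearch` are hypothetical names, keys are plain `int` values rather than `Entry` objects, and a `null` child stands in for an external node. The example tree is the one from the slides (root [11 24] with children [2 6 8], [15], [27 30]).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node type for illustration; not the course's actual classes. */
class MultiWayNode {
    final int[] keys;              // sorted keys, always one fewer than children
    final MultiWayNode[] children; // null when every child is an external node

    MultiWayNode(int[] keys, MultiWayNode... children) {
        this.keys = keys;
        this.children = children.length == 0 ? null : children;
    }
}

class MultiWaySearch {
    /** Returns true if k is stored in the subtree rooted at node. */
    static boolean contains(MultiWayNode node, int k) {
        if (node == null) return false;           // reached an external node
        for (int i = 0; i < node.keys.length; i++) {
            if (k == node.keys[i]) return true;   // found in this node
            if (k < node.keys[i]) return contains(child(node, i), k);
        }
        // k is larger than every key here, so follow the last child
        return contains(child(node, node.keys.length), k);
    }

    /** Appends the keys of the subtree to out in increasing order. */
    static void inorder(MultiWayNode node, List<Integer> out) {
        if (node == null) return;
        for (int i = 0; i < node.keys.length; i++) {
            inorder(child(node, i), out);         // visit child v_i before key e_i
            out.add(node.keys[i]);
        }
        inorder(child(node, node.keys.length), out);
    }

    private static MultiWayNode child(MultiWayNode n, int i) {
        return n.children == null ? null : n.children[i];
    }

    /** Builds the example tree shown in the slides. */
    static MultiWayNode exampleTree() {
        return new MultiWayNode(new int[] {11, 24},
            new MultiWayNode(new int[] {2, 6, 8}),
            new MultiWayNode(new int[] {15}),
            new MultiWayNode(new int[] {27, 30}));
    }

    public static void main(String[] args) {
        MultiWayNode root = exampleTree();
        System.out.println(contains(root, 8));    // find(8) from the slides
        List<Integer> order = new ArrayList<>();
        inorder(root, order);
        System.out.println(order);
    }
}
```

Each call touches one node per level, which is why the number of nodes (and, on disk, pages) visited depends on the height of the tree rather than on the number of keys.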
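
The insertion and overflow rules for (2,4) trees can be sketched in the same spirit. The code below is a minimal sketch, not the lecture's implementation: `TwoFourTree`, `node`, and `show` are hypothetical names, keys are distinct `int` values, and a node with an empty `children` list is one whose children are all external. A split keeps $e_1, e_2$ in the original node, moves $e_4$ into a new node, and promotes $e_3$, repeating at the parent if it overflows in turn.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical (2,4) tree for illustration; assumes distinct int keys. */
class TwoFourTree {
    static class Node {
        final List<Integer> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>(); // empty: children are external
        Node parent;
    }

    Node root = new Node();

    void insert(int k) {
        Node n = root;
        while (!n.children.isEmpty()) {            // search down to the last internal node
            n = n.children.get(slot(n, k));
        }
        n.keys.add(slot(n, k), k);                 // add the key in sorted position
        while (n != null && n.keys.size() > 3) {   // 4 keys means a 5-node: overflow
            n = split(n);                          // may cascade to the parent
        }
    }

    /** Index of the first key that is not smaller than k. */
    private static int slot(Node n, int k) {
        int i = 0;
        while (i < n.keys.size() && k > n.keys.get(i)) i++;
        return i;
    }

    /** Splits the 5-node n and returns its parent so the caller can re-check it. */
    private Node split(Node n) {
        int promoted = n.keys.get(2);              // e3 moves up
        Node right = new Node();
        right.keys.add(n.keys.get(3));             // e4 forms the new 2-node
        if (!n.children.isEmpty()) {
            for (int i = 3; i <= 4; i++) {         // v4 and v5 follow e4
                Node c = n.children.get(i);
                c.parent = right;
                right.children.add(c);
            }
            n.children.subList(3, 5).clear();
        }
        n.keys.subList(2, 4).clear();              // n keeps e1, e2 as a 3-node

        Node p = n.parent;
        if (p == null) {                           // overflow at the root: new root
            p = new Node();
            p.children.add(n);
            n.parent = p;
            root = p;
        }
        right.parent = p;
        int idx = p.children.indexOf(n);
        p.keys.add(idx, promoted);
        p.children.add(idx + 1, right);
        return p;
    }

    /** Builds a node by hand so the slide examples can be reproduced. */
    static Node node(int[] keys, Node... kids) {
        Node n = new Node();
        for (int k : keys) n.keys.add(k);
        for (Node c : kids) { c.parent = n; n.children.add(c); }
        return n;
    }

    /** Renders a subtree as keys followed by its children in parentheses. */
    static String show(Node n) {
        StringBuilder sb = new StringBuilder(n.keys.toString());
        if (!n.children.isEmpty()) {
            sb.append('(');
            for (Node c : n.children) sb.append(show(c));
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TwoFourTree t = new TwoFourTree();
        t.root = node(new int[] {15, 24},
            node(new int[] {12}), node(new int[] {18}), node(new int[] {27, 32, 35}));
        t.insert(30);
        System.out.println(show(t.root)); // prints [15, 24, 32]([12][18][27, 30][35])
    }
}
```

The `main` method reproduces the split example from the slides: inserting 30 overflows [27 32 35], 32 is promoted, and the root becomes [15 24 32]. Building the parent-overflow tree with `node(...)` and calling `insert(29)` exercises the cascade: under these rules the full root splits as well and a new root is created above it.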