
AVL trees


Presentation Transcript


  1. AVL trees An AVL tree is a BST such that for nonempty instances • the LST and RST differ in height by at most 1, and • the LST and RST are both AVL trees • Alternatively: an AVL tree is a BST where each node has balance -1, 0, or 1, • where the balance of a node N is N.right.height() - N.left.height()
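
As a concrete rendering of this definition, here is a minimal node sketch in Java (the class and field names are illustrative, not Weiss's code); balance() computes N.right.height() - N.left.height(), with the usual convention that an empty subtree has height -1.

    // Illustrative AVL node sketch; names are not from Weiss's code.
    class AvlNode {
        int key;
        AvlNode left, right;
        int height;                       // height of the subtree rooted here (a single node has height 0)

        AvlNode(int key) { this.key = key; }

        // Height of a possibly empty subtree; the empty tree has height -1.
        static int height(AvlNode t) {
            return (t == null) ? -1 : t.height;
        }

        // Balance of a node: right subtree height minus left subtree height.
        // In an AVL tree every node's balance is -1, 0, or +1.
        static int balance(AvlNode t) {
            return height(t.right) - height(t.left);
        }
    }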

  2. AVL tree height • An AVL tree of n nodes has height O(log n) • note: ordinary BSTs can't make this guarantee • To see this, let N(h) be the smallest number of nodes in an AVL tree of height h • then N(0) = 1, N(1) = 2, and for h > 1, N(h) = 1 + N(h-1) + N(h-2) • a simple induction shows that N(h) > a^(h-1), where a = 1.5 • so for any AVL tree of height h > 0 with n nodes, n >= N(h) > a^(h-1) • and then h < 1 + log_a(n), which is O(log n)
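
The recurrence is easy to tabulate; this throwaway snippet (not part of the slides) prints N(h) next to the claimed lower bound 1.5^(h-1).

    // Quick numeric check of N(h) = 1 + N(h-1) + N(h-2) against the bound a^(h-1) with a = 1.5.
    public class AvlMinSize {
        public static void main(String[] args) {
            long prev = 1, curr = 2;                      // N(0) = 1, N(1) = 2
            for (int h = 2; h <= 30; h++) {
                long next = 1 + curr + prev;              // N(h)
                prev = curr;
                curr = next;
                double bound = Math.pow(1.5, h - 1);
                System.out.printf("h=%2d  N(h)=%9d  1.5^(h-1)=%12.1f%n", h, curr, bound);
            }
        }
    }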

  3. AVL tree operations • The search operation for ordinary BSTs may be used for AVL trees • Insertion for AVL trees is ordinary BST insertion followed by rebalancing • to restore the AVL property • Deletion is similar to insertion • we’ll say little about it

  4. AVL tree insertion • After BST insertion but before rebalancing, the possible node balances are limited • they can only be -2, -1, 0, +1, or +2 • The balances -2 and +2 may only occur along the path to the newly inserted node. • It's enough to rebalance at the lowest node with one of these two balances • doing so will automatically fix all bad balances between that node and the root

  5. AVL rebalancing as rotation • There are 5 possible binary tree shapes with n=3 nodes: LL, LR, the balanced tree, RL, and RR • Only the middle one (the balanced tree) is an AVL tree • rebalancing simply makes the other 4 look like it • For larger n, subtrees go where they need to go to preserve the binary search property

  6. AVL rotations • Recall that rotation is performed at a node • The LL, LR, RL, and RR cases are defined in terms of the balance at this node, and at the root of its higher subtree, as shown above • We’ll see shortly why these are the only possible cases • Note that there's a symmetry between right and left
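
Since the figure itself is not reproduced in this transcript, a code sketch of the four cases may help. It builds on the AvlNode sketch above; the method names are illustrative rather than Weiss's. Note how the LR and RL cases are simply compositions of the two single rotations (the point made on the next slide).

    // Illustrative rotation helpers for the AvlNode sketch above; not Weiss's exact code.
    class AvlRotations {

        // LL case: a single rotation at t, promoting t's left child.
        static AvlNode rotateWithLeftChild(AvlNode t) {
            AvlNode l = t.left;
            t.left = l.right;                 // l's right subtree becomes t's left subtree
            l.right = t;                      // t becomes l's right child
            t.height = 1 + Math.max(AvlNode.height(t.left), AvlNode.height(t.right));
            l.height = 1 + Math.max(AvlNode.height(l.left), AvlNode.height(l.right));
            return l;                         // new root of this subtree
        }

        // RR case: the mirror image, promoting t's right child.
        static AvlNode rotateWithRightChild(AvlNode t) {
            AvlNode r = t.right;
            t.right = r.left;
            r.left = t;
            t.height = 1 + Math.max(AvlNode.height(t.left), AvlNode.height(t.right));
            r.height = 1 + Math.max(AvlNode.height(r.left), AvlNode.height(r.right));
            return r;
        }

        // LR case: an RR rotation at the left child, then an LL rotation at t.
        static AvlNode doubleRotateLeftRight(AvlNode t) {
            t.left = rotateWithRightChild(t.left);
            return rotateWithLeftChild(t);
        }

        // RL case: the mirror image.
        static AvlNode doubleRotateRightLeft(AvlNode t) {
            t.right = rotateWithLeftChild(t.right);
            return rotateWithRightChild(t);
        }
    }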

  7. AVL rotations in Weiss • Note that book's Figure 4.40 for LL differs from ours • Weiss is simply observing that one subtree moves as a unit • Note that an inorder traversal processes the nodes and subtrees in the same order after rotation as before • Note that LR rebalancing can be expressed as an RR rebalancing followed by an LL rebalancing

  8. Claims about balance • Balances can only change if a height does • If balance changes at a node, then it must have changed everywhere below the node • since the height must change • also, all these lower nodes must have become more imbalanced.

  9. Further claims about balance • At the node N where rotation occurs, the balance must have changed from +1 to +2 or from -1 to -2. • All nodes between N and the new node must have had a balance change to a legal value, • and hence from 0 to +1 or from 0 to -1 • After the rotation, N has the same height as it did before the insertion.

  10. Why 1 rotation gives an AVL tree • After insertion but before rotation, the nodes off the path to the new node are balanced • and rotation doesn’t change this • Rotation fixes the balance at N • Nodes below N were balanced before rotation (since N is the lowest bad node) • they do not become unbalanced by rotation • Those above this node remain balanced • since the appropriate subtree heights are the same after rotation as they were before insertion.
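
Putting slides 3, 4, and 10 together, here is one way the insertion might look in code, using the AvlNode and rotation sketches above (an illustrative sketch, not Weiss's implementation). Rebalancing happens as the recursion unwinds, so the first rotation performed is at the lowest node whose balance has become -2 or +2.

    // Illustrative recursive AVL insertion using the AvlNode and AvlRotations sketches above.
    class AvlInsert {

        static AvlNode insert(int key, AvlNode t) {
            if (t == null) return new AvlNode(key);
            if (key < t.key)       t.left  = insert(key, t.left);
            else if (key > t.key)  t.right = insert(key, t.right);
            else                   return t;              // duplicate key: do nothing

            // Rebalance on the way back up; the first node fixed is the lowest bad one.
            int balance = AvlNode.balance(t);
            if (balance < -1) {                           // left subtree too tall
                if (key < t.left.key)
                    t = AvlRotations.rotateWithLeftChild(t);      // LL
                else
                    t = AvlRotations.doubleRotateLeftRight(t);    // LR
            } else if (balance > 1) {                     // right subtree too tall
                if (key > t.right.key)
                    t = AvlRotations.rotateWithRightChild(t);     // RR
                else
                    t = AvlRotations.doubleRotateRightLeft(t);    // RL
            }
            t.height = 1 + Math.max(AvlNode.height(t.left), AvlNode.height(t.right));
            return t;
        }
    }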

  11. Splay trees • One possibly annoying feature of AVL trees is the need to check tree heights • So what if we just rotate without checking? • The intuition behind a splay tree: the most recently accessed node moves to the root • together with locality of reference (recently accessed items tend to be accessed again soon; cf. LRU), this is likely to make future accesses more efficient • The motion is like AVL rotations, but the accessed node moves up two levels per step, even in the single-rotation (LL and RR) cases

  12. Splay tree rotations • There are 4 cases for rotation (LL, LR, RL, and RR), just as for AVL trees. • However, the word zig is usually used instead of L, and zag instead of R. • The LR and RL cases are handled by the same double rotations as for AVL trees. • The LL and RR cases are handled by making the accessed node the root of the subtree, with the old parent and grandparent dangling off the far side (to the right in the LL case, to the left in the RR case).
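
A sketch of one splay step in the same illustrative style (the node class and method names are assumptions, not the book's code). The zig-zag (LR) case is the familiar double rotation; the zig-zig (LL) case rotates at the grandparent first and then at the old parent, which is what produces the "dangling to the right" shape described above.

    // Illustrative splay-step sketch; Node has key, left, right (no heights needed).
    class SplayStep {
        static class Node {
            int key;
            Node left, right;
            Node(int key) { this.key = key; }
        }

        // Single rotations (same shape as the AVL ones, without height bookkeeping).
        static Node rotateLeftChildUp(Node t)  { Node l = t.left;  t.left  = l.right; l.right = t; return l; }
        static Node rotateRightChildUp(Node t) { Node r = t.right; t.right = r.left;  r.left  = t; return r; }

        // Zig-zig (LL) step at grandparent g: the accessed node is g.left.left.
        // Rotate at g first, then at the old parent; this order is what distinguishes
        // a splay step from two independent bottom-up single rotations.
        static Node zigZigLeft(Node g) {
            g = rotateLeftChildUp(g);          // old parent comes up, g dangles to the right
            return rotateLeftChildUp(g);       // accessed node comes up, parent dangles to the right
        }

        // Zig-zag (LR) step at grandparent g: the accessed node is g.left.right.
        // This is the same double rotation as the AVL LR case.
        static Node zigZagLeftRight(Node g) {
            g.left = rotateRightChildUp(g.left);
            return rotateLeftChildUp(g);
        }

        // The RR and RL cases are the mirror images.
    }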

  13. Applying splay tree rotations • These operations are applied repeatedly (moving the node 2 levels per operation) until the node is at the root, or within one level of the root. • In the latter case, the appropriate AVL single rotation then moves the node to the root. • Deletion of a node: • move the node to the root • rotate the maximum node of the LST to the LST's root • attach the RST as the new RST of the LST
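
Assuming a splay(root, key) helper (hypothetical here, not shown) that splays the node with the given key, or the last node on its search path, to the root, the three deletion steps above can be sketched as follows. Note that splaying the LST for the deleted key brings the LST's maximum to its root, since that key is larger than everything in the LST.

    // Illustrative splay-tree deletion, built on the SplayStep.Node sketch above.
    class SplayDelete {

        // Hypothetical helper, assumed rather than shown: splays the node with the given key
        // (or the last node on its search path) to the root and returns the new subtree root.
        static SplayStep.Node splay(SplayStep.Node root, int key) {
            throw new UnsupportedOperationException("sketch only: repeat the splay steps above");
        }

        static SplayStep.Node delete(SplayStep.Node root, int key) {
            if (root == null) return null;
            root = splay(root, key);                  // step 1: move the node to the root
            if (root.key != key) return root;         // key was not present
            SplayStep.Node left = root.left, right = root.right;
            if (left == null) return right;           // no LST: the RST is the whole tree
            left = splay(left, key);                  // step 2: key exceeds every LST key, so this
                                                      // splays the LST's maximum to the LST's root
            left.right = right;                       // step 3: attach the old RST (the max has no
                                                      // right child, so the slot is free)
            return left;
        }
    }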

  14. Efficiency of splay trees • Fact: Search, insertion, and deletion have amortized time complexity O(log n) • the proof is in Chapter 11 -- we won't cover it • cf. the amortized behavior of ArrayList.add • In the examples, note how spending time on one operation makes later operations easier

  15. M-ary search trees • An M-ary search tree is an ordered tree where • Each nonleaf node has at most M children • A nonleaf with k children has k-1 keys. If the keys are indexed from 1 through k-1 and the subtrees from 0 through k-1, then • data items less than key 1 are in subtree 0 • data items between keys j and j+1 are in subtree j • data items greater than key k-1 are in subtree k-1
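
One way such a node might be represented, as an illustrative sketch (the field names and use of lists are assumptions; B-tree implementations typically use fixed-capacity arrays and add the size constraints described on the following slides):

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative M-ary search tree node: k children are separated by k-1 sorted keys.
    class MaryNode {
        List<Integer> keys = new ArrayList<>();       // keys[0..k-2], sorted
        List<MaryNode> children = new ArrayList<>();  // children[0..k-1]; empty for a leaf

        boolean isLeaf() { return children.isEmpty(); }

        // Index of the subtree that must contain x: everything below the first key goes to
        // child 0, values between consecutive keys go to the child between them, and
        // everything at or above the last key goes to the last child.
        int childIndexFor(int x) {
            int i = 0;
            while (i < keys.size() && x >= keys.get(i)) i++;
            return i;
        }
    }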

  16. B-trees • A B-tree is an M-ary search tree where • all data items are stored in leaves • in a nonleaf, key i appears in subtree i • the root may have as few as 2 children, or be a leaf • nonleaves contain from M/2 to M children • leaves contain from L/2 to L data items • all leaves are at the same depth

  17. B-tree pragmatics • We’re actually defining a B+ tree, by requiring that all data items be in leaves • The parameter L may be chosen based on hardware concerns (cf. Weiss, p. 149) • B-tree heights are logarithmic • For the same reason as for ordinary BSTs • In practice, they have very few levels • B-trees reduce the number of levels at the cost of extra work within a level • this makes sense only if external storage is used

  18. Number of nodes in a B+-tree • For a B+-tree, the minimum number of nodes at each level is:

      level    M = L = 100    M = L = 200
        0                1              1
        1                2              2
        2              100            200
        3            5,000         20,000
        4          250,000      2,000,000

  19. Number of keys in a B+-tree • If the leaves are at level k, then the minimum number of keys in a B+-tree is:

      k        M = L = 100    M = L = 200
      0                  1              1
      1                 98            198
      2              4,900         19,800
      3            245,000      1,980,000
      4         12,250,000    198,000,000

  20. B-tree sizes and efficiency • Because of the rapid growth of size with height, B-tree heights are effectively O(1) • and thus so are search, insertion, and deletion • assuming that M and L are not too small • But in examples, we need small M and L • this makes over/underflow much more likely • if M = 3, the tree is often called a 2-3 tree • here nonleaves can have either 2 or 3 children • note that this applies to the root as well

  21. B-tree operations • B-tree search is the natural generalization of BST search • B-tree insertion begins with the natural generalization of BST insertion, and then deals with any overflow • Note that in B+-trees, all insertion is into leaves • B+-tree deletion removes from a leaf, and then deals with any underflow

  22. Handling B+-tree overflow • Leaf overflow: • try passing a key to a sibling node • otherwise split and copy a key up • Nonleaf overflow: • try passing a key to a sibling node • otherwise split and pass a key up
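
To make the "split and copy a key up" case concrete, here is an illustrative sketch of splitting a B+-tree leaf that has overflowed to L+1 items (the representation and names are assumptions; borrowing from a sibling and updating the parent are not shown). Note that the separating key is copied up rather than moved, since in a B+-tree all data items stay in the leaves.

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative B+-tree leaf split.
    class BPlusLeafSplit {
        static class Leaf {
            List<Integer> items = new ArrayList<>();   // sorted data items, at most L of them
            Leaf next;                                 // leaves are often chained for range scans
        }

        static class SplitResult {
            Leaf right;        // the newly created sibling
            int separatorKey;  // the key to copy up into the parent (smallest key in `right`)
        }

        // Precondition: leaf.items.size() == L + 1 (overflow after an insertion).
        static SplitResult split(Leaf leaf) {
            int mid = leaf.items.size() / 2;           // split point: both halves keep at least L/2 items

            Leaf right = new Leaf();
            right.items.addAll(leaf.items.subList(mid, leaf.items.size()));
            leaf.items.subList(mid, leaf.items.size()).clear();

            right.next = leaf.next;                    // keep the leaf chain intact
            leaf.next = right;

            SplitResult r = new SplitResult();
            r.right = right;
            r.separatorKey = right.items.get(0);       // copied (not moved) up, since data stays in leaves
            return r;
        }
    }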

  23. Handling B+-tree underflow • Leaf underflow: • try getting a key or keys from a sibling node • otherwise merge and delete the separating key from the parent • Nonleaf underflow: • try getting a key or keys from a sibling node • otherwise merge and bring down the separating key from the parent

  24. B+-tree algorithm details • Adding or deleting the first key of a leaf requires updating a nonleaf • Passing keys among siblings requires updating the separating key in their parent • Passing keys among siblings is tried first since several keys may be passed at once • also, splitting or merging may propagate

  25. Correctness of splitting • Splitting a node during insertion gives a legal B+-tree, since: • If the node is a leaf • overflow gives a node with L+1 keys • if they're split evenly, each node gets at least L/2 • If the node is a nonleaf • replace "L" and "keys" with "M" and "children" • any new root has 1 key and 2 children, so is legal

  26. Correctness of underflow handling • In case of underflow during deletion: • If a leaf underflows (i.e., gets <= L/2 - 1 keys) • siblings with more than L/2 keys can contribute keys • otherwise the leaf can be merged with a sibling • If a nonroot nonleaf underflows • replace "L" and "keys" with "M" and "children" • If the root underflows (i.e., gets 0 keys) • then 1 child remains -- make it the new root
