
Algorithms and data structures


beatricej

Presentation Transcript


  1. Algorithms and data structures: Balanced trees, Red-Black trees, Context trees, B-trees

  2. Balanced trees
     • The time of BST operations is proportional to the height of the tree.
     • Perfectly balanced tree (by size): for any node, the sizes of the left and right subtrees are equal (with tolerance 1).
     • Perfectly balanced tree (by path length): for each node, the lengths of any two paths from the node to a leaf differ by at most 1.
     • Approximately balanced tree: for each node, the lengths of any two paths from the node to a leaf differ by at most a factor of two.

  3. Examples of balanced trees • AVL trees • Red-Black Trees • B-Trees

  4. Red-Black tree
     • (1) Every node is either black or red
     • (2) NULL is black
     • (3) A red node cannot have a red child
     • (4) For each node, every path from this node to a leaf contains exactly the same number of black nodes (black height)

  5. Red-Black Tree (RBT)
     • Height of the tree: h(RBT) ≤ 2 lg(n+1)
     [Figure: example red-black tree with root 30 and keys 1, 4, 8, 10, 15, 18, 20, 25, 26, 27, 28, 29, 33, 35, 36, 37, 40, 41, 45; the NULL leaves are drawn explicitly]

  6. Balance vs RBT
     • Any path from a node to a leaf (or to NULL) contains the same number of black nodes
     • There cannot be a node that has fewer than 2 children and has a black descendant
     [Figure: node with NULL children]

  7. Properties of RBT
     • h(RBT) ≤ 2 log2(n+1)
     • Search → O(log(n))
     • Min → O(log(n))
     • Max → O(log(n))
     • Successor → O(log(n))
     • Predecessor → O(log(n))

  8. Node definition
     class NODE:
         data = None
         color = BLACK
         left = None
         right = None
         parent = None

  9. Rotation
            y                               x
           / \      RightRotate(T,y)       / \
          x   C    ----------------->     A   y
         / \       <-----------------        / \
        A   B       LeftRotate(T,x)         B   C

  10. LeftRotate
     [Figure: the rotation diagram from the previous slide; x with right child y and subtrees A, B, C]
     def LeftRotate(root, x):
         y = x.right
         if y is None:              # nothing to rotate
             return root
         x.right = y.left           # subtree B moves from y to x
         if y.left is not None:
             y.left.parent = x
         y.parent = x.parent
         if x.parent is None:
             root = y
         elif x == x.parent.left:
             x.parent.left = y
         else:
             x.parent.right = y
         y.left = x
         x.parent = y
         return root
     How does the code change for RightRotate?
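One possible answer to the question on this slide, as a sketch: RightRotate mirrors LeftRotate with every left and right swapped. The minimal `Node` class is an assumption added only to keep the example self-contained.

```python
class Node:
    """Hypothetical minimal node, just enough for the rotation sketch."""
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.parent = None


def RightRotate(root, y):
    """Mirror image of LeftRotate: the left child x of y moves up, y becomes its right child."""
    x = y.left
    if x is None:               # nothing to rotate
        return root
    y.left = x.right            # subtree B moves from x to y
    if x.right is not None:
        x.right.parent = y
    x.parent = y.parent
    if y.parent is None:        # y was the root, so x becomes the root
        root = x
    elif y is y.parent.left:
        y.parent.left = x
    else:
        y.parent.right = x
    x.right = y
    y.parent = x
    return root
```

The function returns the root in every branch, so a caller can always write `root = RightRotate(root, y)`.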

  11. Insertion of the node
     [Figure: three example insertions, captioned]
     • Cannot break the equal black path length rule (4)
     • Can break the equal black path length rule (4)
     • Can break the no-red-child rule (3)

  12. Insertion of the node
     • Insert node x as a leaf
     • Set x.color = red
     • Fix up the tree (rule #3 can be broken)
     • s denotes the child side of x (i.e. left or right); s' denotes the opposite side

  13. RBT:INS Fix-up process - 1
     [Figure: example tree (keys 1, 4, 5, 6, 8, 10, 15, 20, 25) before and after Case #1, with x, uncle and new x marked]
     Case #1: uncle(x) is red
     • recolor grandparent(x), parent(x), uncle(x)
     • continue the fix-up from the grandparent, i.e. x = grandparent(x)

  14. RBT:INS Fix-up process - 2
     [Figure: example tree (keys 1, 4, 5, 6, 8, 10, 15, 20, 25) before and after Case #2, with x, uncle and new x marked]
     Case #2: x is the same side-child as uncle(x) (i.e. both are s-children, where s is left or right)
     • set x = parent(x)
     • s'-rotate the tree on x (i.e. on the new x, against the uncle's direction; s' is the side opposite to s)
     Note: after this operation x is the opposite side-child of its parent, compared with uncle(x) as a child of parent(uncle(x))

  15. RBT:INS Fix-up process - 3
     [Figure: example tree (keys 1, 4, 5, 6, 8, 10, 15, 20, 25) before and after Case #3, with x and uncle marked]
     Case #3: x is the opposite side-child vs. uncle(x) (i.e. x is an s-child and uncle(x) is an s'-child, where s is left or right and s' the opposite side)
     • recolor parent(x) and grandparent(x)
     • rotate the tree on grandparent(x) in the uncle's direction

  16. Fixing the tree after INS op.
     assumption: the root is black (why not?)
     while x != root and parent(x).color == Color.RED:
         if uncle(x).color == Color.RED:
             recolor(grandparent(x), parent(x), uncle(x))    #1
             x = grandparent(x)                              # continue from grandparent
         else:
             if same_side_child(x, uncle(x)):                # uncle(x) is the same side-child as x
                 x = parent(x)                               #2
                 rotate_away_from_uncle(root, x)             # i.e. on the new x
             recolor(parent(x), grandparent(x))              #3
             rotate_towards_uncle(root, grandparent(x))
     root.color = Color.BLACK                                # for one-element tree

  17. The node insertion: implementation
     def RBTInsertNode(root, x):
         root = BSTInsertNode(root, x)
         x.color = Color.RED
         # assumption: root.color == BLACK
         while x != root and x.parent.color == Color.RED:
             if x.parent == x.parent.parent.left:
                 # father is a left child, so uncle is a right child
                 <fix-up for right-child uncle>
             else:
                 <fix-up for left-child uncle>
         root.color = Color.BLACK
         return root

  18. Fix-up for right-child uncle
     uncle = x.parent.parent.right
     if GetColor(uncle) == Color.RED:
         x.parent.color = Color.BLACK             # case 1
         uncle.color = Color.BLACK
         x.parent.parent.color = Color.RED
         x = x.parent.parent
     else:
         if x == x.parent.right:
             x = x.parent                         # case 2
             root = LeftRotate(root, x)
         x.parent.color = Color.BLACK             # case 3
         x.parent.parent.color = Color.RED
         root = RightRotate(root, x.parent.parent)

  19. Getting color vs. NULL
     def GetColor(node):
         if node != None:
             return node.color
         else:
             return Color.BLACK
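The insertion slides can be combined into one runnable sketch. It is an illustration under stated assumptions, not the slides' exact code: the helpers `rotate` (both rotations in one function), `bst_insert` and `black_height` are hypothetical names, and colors are plain strings instead of the `Color` class.

```python
BLACK, RED = "black", "red"

class Node:
    def __init__(self, data):
        self.data, self.color = data, RED
        self.left = self.right = self.parent = None

def get_color(node):
    return node.color if node is not None else BLACK   # NULL is black

def rotate(root, x, left):
    """LeftRotate on x when left is True, RightRotate otherwise; returns the root."""
    y = x.right if left else x.left
    if y is None:
        return root
    inner = y.left if left else y.right      # subtree that changes parent
    if left:
        x.right, y.left = inner, x
    else:
        x.left, y.right = inner, x
    if inner is not None:
        inner.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    x.parent = y
    return root

def bst_insert(root, x):
    """Plain BST insertion of node x as a leaf; returns the root."""
    parent, cur = None, root
    while cur is not None:
        parent, cur = cur, (cur.left if x.data < cur.data else cur.right)
    x.parent = parent
    if parent is None:
        return x
    if x.data < parent.data:
        parent.left = x
    else:
        parent.right = x
    return root

def rbt_insert(root, x):
    root = bst_insert(root, x)
    x.color = RED
    while x is not root and x.parent.color == RED:
        grand = x.parent.parent
        uncle_is_right = x.parent is grand.left
        uncle = grand.right if uncle_is_right else grand.left
        if get_color(uncle) == RED:                       # case 1
            x.parent.color = uncle.color = BLACK
            grand.color = RED
            x = grand
        else:
            if x is (x.parent.right if uncle_is_right else x.parent.left):
                x = x.parent                              # case 2
                root = rotate(root, x, left=uncle_is_right)
            x.parent.color = BLACK                        # case 3
            x.parent.parent.color = RED
            root = rotate(root, x.parent.parent, left=not uncle_is_right)
    root.color = BLACK
    return root

def black_height(node):
    """Returns the black height of the subtree, asserting rules (3) and (4) on the way."""
    if node is None:
        return 1
    if node.color == RED:
        assert get_color(node.left) == BLACK and get_color(node.right) == BLACK
    lh, rh = black_height(node.left), black_height(node.right)
    assert lh == rh
    return lh + (node.color == BLACK)

root = None
for key in [15, 4, 20, 1, 8, 25, 6, 10, 5]:
    root = rbt_insert(root, Node(key))
black_height(root)   # raises AssertionError if the tree violates rule (3) or (4)
```

Folding the two mirrored fix-ups into one loop with an `uncle_is_right` flag is a design choice made here to keep the sketch short; the slides keep them as two separate branches.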

  20. Removing of the node
     [Figure: removal example with NULL leaves]
     We move the additional black color to the child (or maybe children?)
     Problems:
     • What if the node has any children?
     • Could the node have a black child (children)?

  21. Removing of the node
     • Remove the node as usual
     • If the removed node is black, give an additional black color to its child
     • If the child becomes doubly black, a fix-up is required

  22. RBT: DEL fix-up process - 1
     [Figure: example before and after Case #1 (keys 1, 2, 7, 10, 15; subtrees A to F), with x, brother and the new brother marked]
     Case #1: brother(x) is red
     • recolor brother(x) and parent(x)
     • s-rotate the tree on parent(x) (i.e. towards x; s is the side of x, left or right) and update brother
     Note: after this procedure brother(x) is black

  23. RBT: DEL fix-up process - 2
     [Figure: example before and after Case #2 (keys 1, 2, 5, 7, 9; subtrees A to F), with x, brother and new x marked]
     Case #2: both children of brother(x) are black
     • set brother(x).color = RED
     • set x = parent(x)

  24. RBT: DEL fix-up process - 3
     [Figure: example before and after Case #3 (keys 1, 2, 5, 7, 9; subtrees A to F), with x, brother and the new brother marked]
     Case #3 (... at least one child of brother(x) is red ...): the farther child (s'-child) of brother(x) is black, where s is the side of x and s' the opposite side
     • recolor brother(x) and the s-child of brother(x) (i.e. the closer child of brother)
     • s'-rotate the tree on brother(x) (i.e. against the x-direction) and update brother
     Note: after this procedure the farther child of brother(x) is red

  25. RBT: DEL fix-up process - 4
     [Figure: example before and after Case #4 (keys 1, 2, 3, 5, 7; subtrees A to F), with x and brother marked]
     Case #4: the farther child of brother(x) is red (s is the side of x, s' the opposite side)
     • set brother(x).color = parent(x).color
     • set parent(x).color = BLACK
     • set s'-child(brother(x)).color = BLACK (i.e. the farther child of brother)
     • s-rotate the tree on parent(x) (i.e. towards x)
     • STOP, i.e. x = root

  26. Implementation notes
     To avoid checking whether a child (children) != None before getting its color or following its left/right link:
     • a function (similar to GetColor) could be defined
     • a guard pattern could be implemented, i.e. all the None values could be replaced by a special node
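A minimal sketch of the guard pattern mentioned above, assuming a single shared black node named `NIL` (a hypothetical name) that stands in for every None. The helper `both_children_black` is also hypothetical; it shows the Case #2 test of the DEL fix-up written without any None checks.

```python
BLACK, RED = "black", "red"

class Node:
    def __init__(self, data=None, color=RED, nil=None):
        self.data = data
        self.color = color
        # Children and parent point to the guard node instead of None.
        self.left = self.right = self.parent = nil

# One shared guard node stands for every NULL; it is always black.
NIL = Node(color=BLACK)
NIL.left = NIL.right = NIL.parent = NIL

def both_children_black(node):
    """Case #2 test of the DEL fix-up, with no special handling of missing children."""
    return node.left.color == BLACK and node.right.color == BLACK

brother = Node(7, BLACK, NIL)
print(both_children_black(brother))    # both children are the black guard node
brother.left = Node(5, RED, NIL)
print(both_children_black(brother))    # now the left child is red
```

The trade-off is that code which walks the tree has to stop at `NIL` rather than at None.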

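  27. Augmented RBT ordinal stats.
     Update of sizes:
         def LeftRotate(root, x):
             .....
             y.size = getsize(x)                               # y takes over the whole old subtree of x
             x.size = getsize(x.left) + getsize(x.right) + 1
             return root
     [Figure: T=RightRotate(T,y) and T=LeftRotate(T,x) on nodes x = 42 and y = 93 with subtrees of sizes 6, 4 and 7; with y on top: y.size = 19, x.size = 11; with x on top: x.size = 19, y.size = 12]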
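The size update can be tried on a small tree. The sketch below assumes a plain binary search tree node with a `size` field (colors are left out because they do not affect the sizes); `select` is a hypothetical helper showing what the sizes are for, namely finding the i-th smallest key.

```python
def getsize(node):
    return node.size if node is not None else 0

class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right, self.parent = data, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self
        self.size = getsize(left) + getsize(right) + 1

def LeftRotate(root, x):
    y = x.right
    if y is None:
        return root
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    y.size = getsize(x)                               # must happen before x.size changes
    x.size = getsize(x.left) + getsize(x.right) + 1   # recompute x from its new children
    return root

def select(node, i):
    """Returns the node holding the i-th smallest key (i counted from 1)."""
    rank = getsize(node.left) + 1
    if i == rank:
        return node
    if i < rank:
        return select(node.left, i)
    return select(node.right, i - rank)

x = Node(42, Node(10), Node(93, Node(50), Node(99)))
root = LeftRotate(x, x)
print(root.data, root.size, root.left.size, select(root, 3).data)
```

Only the two rotated nodes need new sizes; every other subtree keeps its node set.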

  28. B-tree
     [Figure: example B-tree; each row is one level and | separates nodes; for a node n, n.keys[0], n.keys[1] are its keys and n.sons[0], n.sons[1], n.sons[2] its children]
       M
       D H | Q T X
       B C | F G | J K L | N P | R S | V W | Y Z
     • A node with i-1 keys has i children
     • The i-th key is greater than (or equal to) all the keys in the i-th child
     • The i-th key is smaller than (or equal to) all the keys in the (i+1)-th child
     • Each node (except for the root) contains at least T-1 keys (i.e. T sons)
     • Each node contains at most 2T-1 keys (i.e. 2T sons)

  29. Minimal B-tree (h=3)
     [Figure: minimal B-tree of height 3: the root holds 1 key, every other node holds T-1 keys and has T children; the number of nodes on successive levels is 1, 2, 2T, 2T^2]
     For T = 2 we get the so-called "2-3-4 tree"
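Counting the keys of the minimal tree gives the usual height bound. The derivation below is a sketch that assumes the height h is counted with the root at depth 0, so depth i ≥ 1 holds at least 2T^(i-1) nodes with T-1 keys each, and n is the total number of keys.

```latex
n \;\ge\; 1 + (T-1)\sum_{i=1}^{h} 2\,T^{\,i-1}
  \;=\; 1 + 2(T-1)\,\frac{T^{h}-1}{T-1}
  \;=\; 2\,T^{h} - 1
\quad\Longrightarrow\quad
h \;\le\; \log_{T}\frac{n+1}{2}
```

For the h=3 case on this slide the minimum is 2T^3 - 1 keys, i.e. 15 keys for T = 2.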

  30. Properties of B-trees
     • A B-tree is perfectly balanced
     • The number of keys (and children) per node varies
     • All the leaves are at the same depth
     • Small height of the tree
     • Designed to minimize the number of accesses to the storage (the root node is kept in memory)
     • Most of the keys are stored in leaves

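  31. Node definition
     T = 5

     def Array(size, initVal=None):
         # a real list (map() would return a one-shot iterator in Python 3)
         return [initVal for _ in range(size)]

     class BNODE:
         def __init__(self):
             # set per instance, so that nodes do not share the same lists
             self.isLeaf = True
             self.cntKey = 0
             self.keys = Array(2*T-1, None)
             self.sons = Array(2*T, None)
             # position of the node in the storage
             self.thisNodeDiscPos = None
             # positions of data (for particular keys) in the storage
             self.dataDiscPos = Array(2*T-1, None)

     class DISCPOS: ...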

  32. Helper functions
     def LoadNode(nodeDiscPos):
         # allocation in memory + read from storage
         ...

     def WriteNodeToDisc(node):
         # writing to storage -> node.thisNodeDiscPos
         ...

     def AllocateNode():
         # allocation in memory and in the storage
         # writing data to storage
         p = BNODE()              # p.isLeaf = True, p.cntKey = 0
         p.thisNodeDiscPos = AllocateSpaceOnDisc()
         WriteNodeToDisc(p)
         return p

  33. Search in B-tree
     def BTreeFind(p, k):
         if node_contains_key(p, k):
             return p
         elif p.isLeaf:
             return None
         else:
             # p is not a leaf and doesn't contain k
             c = get_child_of_node_that_can_contain(p, k)
             ptmp = LoadNode(c)
             ret = BTreeFind(ptmp, k)
             # be sure that ptmp is freed if ret != ptmp
             return ret
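The search can be tried without any storage layer. The sketch below assumes a hypothetical in-memory `BNode` holding Python lists, so `LoadNode` is not needed; `bisect_left` from the standard library plays the role of both `node_contains_key` and `get_child_of_node_that_can_contain`.

```python
from bisect import bisect_left

class BNode:
    """Hypothetical in-memory node: sorted keys plus child nodes (empty list for a leaf)."""
    def __init__(self, keys, sons=()):
        self.keys = list(keys)
        self.sons = list(sons)

    @property
    def isLeaf(self):
        return not self.sons

def btree_find(p, k):
    """Returns the node containing key k, or None when k is absent."""
    i = bisect_left(p.keys, k)             # first position with keys[i] >= k
    if i < len(p.keys) and p.keys[i] == k:
        return p
    if p.isLeaf:
        return None
    return btree_find(p.sons[i], k)        # the only child that can contain k

# The example tree from slide 28.
root = BNode("M", [
    BNode("DH", [BNode("BC"), BNode("FG"), BNode("JKL")]),
    BNode("QTX", [BNode("NP"), BNode("RS"), BNode("VW"), BNode("YZ")]),
])
print(btree_find(root, "R").keys)   # the leaf holding R and S
print(btree_find(root, "E"))        # E is not in the tree
```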

  34. Splitting the node
     T = 4
     Before:  p: ... N W ...       (keys[i-1] = N, keys[i] = W)
              sons[i] = w: P Q R S T U V
     After:   p: ... N S W ...     (keys[i-1] = N, keys[i] = S, keys[i+1] = W)
              sons[i] = w: P Q R      sons[i+1] = y: T U V

  35. Splitting the root node
     T = 4
     Before:  root = w: P Q R S T U V
     After:   root = p: S          (keys[0] = S)
              sons[0] = w: P Q R      sons[1] = y: T U V

  36. Splitting the node in B-tree
     Split of the maximal node w, the i-th child of p:
     • The center key of the 2*T-1 keys of w is moved into node p (before the i-th key)
     • A pointer to the new node z is inserted into node p (right after the i-th child pointer, i.e. after w)
     • The upper T-1 keys of w are moved into z
     • The upper T pointers of w are moved into z
     • The new node z should be returned (if necessary, the receiver should free the memory after usage)

  37. B-tree: Splitting the node
     def BTreeSplit(p, i, w):
         # Assumption: p != w. If we want to split the root node,
         # the new node should be added first (above the old root).
         z = AllocateNode()
         z.isLeaf = w.isLeaf
         z.cntKey, w.cntKey = T-1, T-1
         for j in range(p.cntKey, i, -1):        # make room for the center key at keys[i]
             p.keys[j] = p.keys[j-1]             # p.data[j] = p.data[j-1]
         for j in range(p.cntKey+1, i+1, -1):    # make room for z at sons[i+1]
             p.sons[j] = p.sons[j-1]
         p.keys[i] = w.keys[T-1]                 # p.data[i] = w.data[T-1]
         p.sons[i+1] = z                         # z holds the larger keys, so it follows w
         p.cntKey = p.cntKey + 1
         for j in range(0, T-1):
             z.keys[j] = w.keys[T+j]             # z.data[j] = w.data[T+j]
         for j in range(0, T):
             z.sons[j] = w.sons[T+j]
         WriteNodeToDisc(p)
         WriteNodeToDisc(w)
         WriteNodeToDisc(z)
         return z

  38. T=3
     (each row is one level of the tree; | separates nodes)
     Start:  G M P X
             A C D E | J K | N O | R S T U V | Y Z
     +B:     G M P X
             A B C D E | J K | N O | R S T U V | Y Z
     +Q:     G M P T X
             A B C D E | J K | N O | Q R S | U V | Y Z

  39. T=3
     (each row is one level of the tree; | separates nodes)
     Start:  G M P T X
             A B C D E | J K | N O | Q R S | U V | Y Z
     +L:     P
             G M | T X
             A B C D E | J K L | N O | Q R S | U V | Y Z
     +F:     P
             C G M | T X
             A B | D E F | J K L | N O | Q R S | U V | Y Z

  40. B-tree: Insertion of the key
     #1
     w = root
     if is_maximal_node(root):
         new_root = add_a_new_root(root)
         split_node(new_root, root)        # the old root is the only child of new_root
         w = new_root
     #2
     if w.isLeaf:
         add_to_node_a_key(w, k)
     else:
         c = get_child_of_node_that_can_contain(w, k)
         if is_maximal_node(c):
             split_node(w, c)
             c = get_child_of_node_that_can_contain(w, k)
         recursively_continue_#2_for_node(c)
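The insertion procedure can be sketched in memory as follows. It is an illustration under assumptions: `BNode` stores Python lists instead of fixed arrays, there is no storage layer, and `split_child` and `btree_insert` are hypothetical names; list slicing replaces the index loops of BTreeSplit. The usage lines replay the T=3 insertions of B, Q, L and F from the example slides.

```python
from bisect import bisect_left, insort

T = 3  # minimal degree, as in the T=3 examples

class BNode:
    def __init__(self, keys=(), sons=()):
        self.keys, self.sons = list(keys), list(sons)

def is_maximal(node):
    return len(node.keys) == 2 * T - 1

def split_child(p, i):
    """Splits the maximal child p.sons[i]; its center key moves up into p."""
    w = p.sons[i]
    z = BNode(w.keys[T:], w.sons[T:])      # upper T-1 keys and upper T children
    p.keys.insert(i, w.keys[T - 1])        # center key goes before the i-th key
    p.sons.insert(i + 1, z)                # the new node goes right after w
    w.keys, w.sons = w.keys[:T - 1], w.sons[:T]

def btree_insert(root, k):
    """Inserts k in one downward pass; returns the (possibly new) root."""
    if is_maximal(root):
        root = BNode(sons=[root])          # new empty root above the old one
        split_child(root, 0)
    w = root
    while w.sons:                          # descend until a leaf is reached
        i = bisect_left(w.keys, k)
        if is_maximal(w.sons[i]):
            split_child(w, i)
            if k > w.keys[i]:              # pick the correct half after the split
                i += 1
        w = w.sons[i]
    insort(w.keys, k)                      # the leaf is guaranteed not to be maximal
    return root

root = BNode("GMPX", [BNode("ACDE"), BNode("JK"), BNode("NO"),
                      BNode("RSTUV"), BNode("YZ")])
for key in "BQLF":
    root = btree_insert(root, key)
print(root.keys, [c.keys for c in root.sons])
```

Splitting every maximal node on the way down means a split never has to propagate back up, which is why a single pass is enough.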

  41. T=3
     (each row is one level of the tree; | separates nodes)
     Start:  P
             C G M | T X
             A B | D E F | J K L | N O | Q R S | U V | Y Z
     -F:     P
             C G M | T X
             A B | D E | J K L | N O | Q R S | U V | Y Z
     -M:     P
             C G L | T X
             A B | D E | J K | N O | Q R S | U V | Y Z

  42. T=3
     (each row is one level of the tree; | separates nodes)
     Start:        P
                   C G L | T X
                   A B | D E | J K | N O | Q R S | U V | Y Z
     -S (step 1):  L
                   C G | P T X
                   A B | D E | J K | N O | Q R S | U V | Y Z
     -S (step 2):  L
                   C G | P T X
                   A B | D E | J K | N O | Q R | U V | Y Z

  43. T=3
     (each row is one level of the tree; | separates nodes)
     Start:  P
             C G L | T X
             A B | D E | J K | N O | Q R S | U V | Y Z
     -G:     P
             C L | T X
             A B | D E J K | N O | Q R S | U V | Y Z
     -D:     C L P T X
             A B | E J K | N O | Q R S | U V | Y Z

  44. B-tree: Removing of the key
     # do not visit minimal nodes!
     w = root
     if w.isLeaf and node_contains_key(w, k):
         remove_from_node_the_key(w, k)
     elif not w.isLeaf and node_contains_key(w, k):
         p = get_child_preceding_the_key(w, k)
         n = get_child_succeeding_the_key(w, k)
         if is_minimal_node(p) and is_minimal_node(n):
             new_node = merge_nodes(p, n)
             recursively_remove_key_from_node(new_node, k)
         else:
             if not is_minimal_node(p):
                 k1 = find_predecessor(p, k)
                 recursively_remove_key_from_node(p, k1)    # k1, not k, lives in the child
             else:
                 k1 = find_successor(n, k)
                 recursively_remove_key_from_node(n, k1)
             replace_k_with_k1

  45. B-tree: Removing of the key
     elif not w.isLeaf and not node_contains_key(w, k):
         p = get_child_of_node_that_can_contain(w, k)
         if is_minimal_node(p):
             l = get_left_brother(p)
             r = get_right_brother(p)
             if not is_minimal_node(l):
                 move_key_from_node_to_node(l, w, p)
             elif not is_minimal_node(r):
                 move_key_from_node_to_node(r, w, p)
             else:
                 p = merge_nodes(p, l)       # or with r when p has no left brother
         continue_the_process_from(p)
     else:
         # i.e. w is a leaf and doesn't contain k
         return
