270 likes | 399 Vues
This lecture provides an in-depth exploration of Binary Search Trees (BSTs), highlighting their unique structure, properties, and the operations that can be performed on them. Key algorithms for searching, inserting, finding, and deleting nodes are presented alongside practical examples. The challenges of deletion in BSTs are discussed in detail, comparing different cases such as leaf nodes and nodes with one or two children. The concepts of lazy deletion and building trees from scratch are also introduced, offering insights into efficient data management.
E N D
CSE 326: Data StructuresLecture #7Binary Search Trees Alon Halevy Spring Quarter 2001
Binary Trees A • Many algorithms are efficient and easy to program for the special case of binary trees • Binary tree is • a root • left subtree (maybe empty) • right subtree (maybe empty) B C D E F G H I J
A B C D E F Data left pointer right pointer Representation A B C D E F
Properties of Binary Trees Max # of leafs in a tree of height h = Max # of nodes in a tree of height h = A B C D E F G
Operations create destroy insert find delete Dictionary: Stores values associated with user-specified keys keys may be any (homogenous) comparable type values may be any (homogenous) type implementation: data field is a struct with two parts Search ADT: keys = values kim chi spicy cabbage kreplach tasty stuffed dough kiwi Australian fruit Dictionary & Search ADTs insert • kohlrabi • - upscale tuber find(kreplach) • kreplach • - tasty stuffed dough
Naïve Implementations Goal: fast find like sorted array, dynamic inserts/deletes like linked list
Search tree property all keys in left subtree smaller than root’s key all keys in right subtree larger than root’s key result: easy to find any given key inserts/deletes by changing links Binary Search Tree Dictionary Data Structure 8 5 11 2 6 10 12 4 7 9 14 13
Example and Counter-Example 5 8 4 8 5 18 1 7 11 2 6 10 11 7 3 4 BINARY SEARCH TREE NOT A BINARY SEARCH TREE
In Order Listing visit left subtree visit node visit right subtree 10 5 15 2 9 20 17 7 30 In order listing: 25791015172030
Finding a Node Node *& find(Comparable x, Node * root) { if (root == NULL) return root; else if (x < root->key) return find(x, root->left); else if (x > root->key) return find(x, root->right); else return root; } 10 5 15 2 9 20 17 7 30 runtime:
Insert Concept:proceed down tree as in Find; if new key not found, then insert a new node at last spot traversed void insert(Comparable x, Node * root) { assert ( root != NULL ); if (x < root->key){ if (root->left == NULL) root->left = new Node(x); else insert( x, root->left ); } else if (x > root->key){ if (root->right == NULL) root->right = new Node(x); else insert( x, root->right ); } }
BuildTree for BSTs Suppose a1, a2, …, an are inserted into an initially empty BST: • a1, a2, …, an are in increasing order • a1, a2, …, an are in decreasing order • a1 is the median of all, a2 is the median of elements less than a1, a3 is the median of elements greater than a1, etc. • data is randomly ordered
Examples of Building from Scratch • 1, 2, 3, 4, 5, 6, 7, 8, 9 • 5, 3, 7, 2, 4, 6, 8, 1, 9
Analysis of BuildTree • Worst case is O(n2) 1 + 2 + 3 + … + n = O(n2) • Average case assuming all orderings equally likely is O(n log n) • not averaging over all binary trees, rather averaging over all input sequences (inserts) • equivalently: average depth of a node is log n • proof: see Introduction to Algorithms, Cormen, Leiserson, & Rivest
Find minimum Findmaximum Bonus: FindMin/FindMax 10 5 15 2 9 20 17 7 30
Deletion 10 5 15 2 9 20 17 7 30 Why might deletion be harder than insertion?
Deletion - Leaf Case Delete(17) 10 5 15 2 9 20 17 7 30
Deletion - One Child Case Delete(15) 10 5 15 2 9 20 7 30
Deletion - Two Child Case Delete(5) 10 5 20 2 9 30 7 replace node with value guaranteed to be between the left and right subtrees: the successor Could we have used the predecessor instead?
Finding the Successor Find the next larger node in this node’s subtree. • not next larger in entire tree Node * succ(Node * root) { if (root->right == NULL) return NULL; else return min(root->right); } 10 5 15 2 9 20 17 7 30 How many children can the successor of a node have?
Predecessor Find the next smaller node in this node’s subtree. Node * pred(Node * root) { if (root->left == NULL) return NULL; else return max(root->left); } 10 5 15 2 9 20 17 7 30
Deletion - Two Child Case Delete(5) 10 5 20 2 9 30 7 always easy to delete the successor – always has either 0 or 1 children!
Delete Code void delete(Comparable x, Node *& p) { Node * q; if (p != NULL) { if (p->key < x) delete(x, p->right); else if (p->key > x) delete(x, p->left); else { /* p->key == x */ if (p->left == NULL) p = p->right; else if (p->right == NULL) p = p->left; else { q = successor(p); p->key = q->key; delete(q->key, p->right); } } } }
Lazy Deletion • Instead of physically deleting nodes, just mark them as deleted • simpler • physical deletions done in batches • some adds just flip deleted flag • extra memory for deleted flag • many lazy deletions slow finds • some operations may have to be modified (e.g., min and max) 10 5 15 2 9 20 17 7 30
Lazy Deletion Delete(17) Delete(15) Delete(5) Find(9) Find(16) Insert(5) Find(17) 10 5 15 2 9 20 17 7 30
Dictionary Implementations BST’s looking good for shallow trees, i.e. the depth D is small (log n), otherwise as bad as a linked list!
Beauty is Only (log n) Deep • Binary Search Trees are fast if they’re shallow: • e.g.: perfectly complete • e.g.: perfectly complete except the “fringe” (leafs) • any other good cases? Problems occur when one branch is much longer than the other! What matters here?