
Hashing, B-Trees and Red-Black Trees using Parallel Algorithms & Sequential Algorithms

This presentation covers the importance of parallel computations and the concepts of hashing, B-trees, and red-black trees. It discusses sequential and parallel algorithms for these data structures.



Presentation Transcript


  1. Hashing, B-Trees and Red-Black Trees using Parallel Algorithms & Sequential Algorithms By Yazeed K. Almarshoud

  2. Road map • Introduction. • Definitions. • Hashing. • B-Trees. • Red-Black Trees. • Sequential algorithms. • Parallel algorithms.

  3. Introduction • In this presentation I am glad to present the importance of parallel computation and how it is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently.

  4. Hashing

  5. Definitions • Hashing: A hash function is any well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer, that may serve as an index into an array. • E.g., keys made up of alphabetical characters could be replaced by their ASCII equivalents. • Two standard hashing techniques: • Division method. • Multiplication method.
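The ASCII idea above can be sketched in a few lines. This is a minimal illustration, not the slides' own code; the function names and the base-128 folding are assumptions of the sketch.

```python
def string_to_int(key: str) -> int:
    """Fold the characters of a key into one integer (base-128 polynomial),
    replacing each character by its ASCII/Unicode code point."""
    value = 0
    for ch in key:
        value = value * 128 + ord(ch)
    return value

def hash_division(key: str, m: int = 13) -> int:
    """Use the resulting integer as an index into a table of m slots."""
    return string_to_int(key) % m
```

Any string key now maps to a slot index in the range 0 … m–1.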

  6. Symbol – table problem

  7. Hash functions

  8. Choosing a hash function The assumption of simple uniform hashing is hard to guarantee, but several common techniques tend to work well in practice as long as their deficiencies can be avoided. • Desiderata: • A good hash function should distribute the keys uniformly into the slots of the table. • Regularity in the key distribution should not affect this uniformity.

  9. Division method • Assume all keys are integers, and define h(k) = k mod m. • Deficiency: Don’t pick an m that has a small divisor d. A preponderance of keys that are congruent modulo d can adversely affect uniformity. • Extreme deficiency: If m = 2^r, then the hash doesn’t even depend on all the bits of k: If k = 1011000111011010₂ and r = 6, then h(k) = 011010₂.

  10. Division method (continued) • h(k) = k mod m. Pick m to be a prime not too close to a power of 2 or 10 and not otherwise used prominently in the computing environment. • Annoyance: Sometimes making the table size a prime is inconvenient. Still, this method is popular, although the next method we’ll see is usually superior.
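The extreme deficiency on the previous slide can be checked directly; a minimal sketch (the function name is illustrative):

```python
def h(k: int, m: int) -> int:
    """Division-method hash: h(k) = k mod m."""
    return k % m

# If m = 2^r, the hash keeps only the low r bits of k.
# The slide's example: k = 1011000111011010 in binary, r = 6.
k = 0b1011000111011010
low_six = h(k, 2 ** 6)  # only the last 6 bits of k survive
```

Here `low_six` equals 011010₂, exactly as on the slide: the high-order bits of k never influence the slot.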

  11. Multiplication method • Assume that all keys are integers, m = 2^r, and our computer has w-bit words. Define h(k) = (A·k mod 2^w) rsh (w – r), where rsh is the “bit-wise right-shift” operator and A is an odd integer in the range 2^(w–1) < A < 2^w. • Don’t pick A too close to 2^w. • Multiplication modulo 2^w is fast. • The rsh operator is fast.
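The formula translates almost verbatim into code. A minimal sketch, with small parameters chosen purely for illustration (w = 8, r = 3, and A = 179, an odd integer with 2^(w–1) < A < 2^w):

```python
def mult_hash(k: int, A: int, w: int, r: int) -> int:
    """h(k) = (A*k mod 2^w) rsh (w - r); the table size is m = 2^r."""
    return ((A * k) % (1 << w)) >> (w - r)

# The slot index is always the top r bits of the w-bit product,
# so it lies in the range 0 .. 2^r - 1.
index = mult_hash(5, 179, 8, 3)
```

Both the modulo (a mask on a binary machine) and the shift are single cheap operations, which is the method's selling point.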

  12. Multiplication method example

  13. Resolving collisions by chaining

  14. Analysis of chaining

  15. Search cost
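Collision resolution by chaining can be sketched as follows; the class name and the choice h(k) = k mod m are assumptions of this sketch, not from the slides:

```python
class ChainedTable:
    """Each slot holds a list (chain) of all the keys hashed to it."""

    def __init__(self, m: int):
        self.T = [[] for _ in range(m)]

    def _chain(self, k: int) -> list:
        return self.T[k % len(self.T)]   # h(k) = k mod m

    def insert(self, k: int) -> None:
        self._chain(k).append(k)         # a collision simply extends the chain

    def find(self, k: int) -> bool:
        return k in self._chain(k)       # cost ~ 1 + chain length

    def delete(self, k: int) -> None:
        self._chain(k).remove(k)
```

With n keys in m slots the expected chain length is the load factor n/m, which is where the expected search cost of chaining comes from.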

  16. Resolving collisions by open addressing • No storage is used outside of the hash table itself. • The hash function depends on both the key and the probe number: • h : U × {0, 1, …, m–1} → {0, 1, …, m–1}. • E.g. h(k,i) = (k+i) mod m; h(k,i) = (k+i²) mod m. • Inserting a key k: • we check T[h(k,0)]. If empty, we insert k there. Otherwise, • we check T[h(k,1)]. If empty, we insert k there. Otherwise, • we continue with h(k,2), h(k,3), …, h(k,m–1). • Finding a key k: • we check whether T[h(k,0)] is empty, and whether it equals k. If not, • we check whether T[h(k,1)] is empty, and whether it equals k. If not, • we continue with h(k,2), h(k,3), …, h(k,m–1). • Deleting a key k: • find it and replace it with a dummy marker (why?)
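The three operations above can be sketched as one small class. This is a minimal illustration using the linear probe sequence h(k,i) = (k+i) mod m; the class name and sentinel objects are assumptions of the sketch. The DELETED marker also answers the slide's "why?": a deleted slot must not stop later probe sequences.

```python
EMPTY, DELETED = object(), object()   # sentinel markers (assumed by this sketch)

class OpenAddressTable:
    def __init__(self, m: int):
        self.m = m
        self.T = [EMPTY] * m          # all storage lives inside the table

    def _probe(self, k: int, i: int) -> int:
        return (k + i) % self.m       # h(k, i) = (k + i) mod m

    def insert(self, k: int) -> None:
        for i in range(self.m):
            j = self._probe(k, i)
            if self.T[j] is EMPTY or self.T[j] is DELETED:
                self.T[j] = k         # first free slot on the probe sequence
                return
        raise RuntimeError("table full")

    def find(self, k: int) -> bool:
        for i in range(self.m):
            j = self._probe(k, i)
            if self.T[j] is EMPTY:    # a truly empty slot ends the search
                return False
            if self.T[j] == k:
                return True
        return False

    def delete(self, k: int) -> None:
        for i in range(self.m):
            j = self._probe(k, i)
            if self.T[j] is EMPTY:
                return
            if self.T[j] == k:
                self.T[j] = DELETED   # dummy marker keeps later probes alive
                return
```

If delete set the slot back to EMPTY, a find for a key inserted past that slot would stop early and wrongly report the key as absent.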

  17. Example of open addressing

  18. Example of open addressing

  19. Example of open addressing

  20. Example of open addressing

  21. Probing strategies • Linear probing: • Given an ordinary hash function h′(k), linear probing uses the hash function h(k,i) = (h′(k) + i) mod m. • This method, though simple, suffers from primary clustering, where long runs of occupied slots build up, increasing the average search time. Moreover, the long runs of occupied slots tend to get longer.
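Primary clustering is easy to see in a small experiment; a minimal sketch assuming h′(k) = k mod m (the function name is illustrative):

```python
def linear_probe_insert(T: list, k: int) -> int:
    """Insert k using h(k, i) = (k + i) mod m and return the slot used."""
    m = len(T)
    for i in range(m):
        j = (k + i) % m
        if T[j] is None:
            T[j] = k
            return j
    raise RuntimeError("table full")

# Keys that all hash to slot 0 pile up into one contiguous run:
T = [None] * 11
slots = [linear_probe_insert(T, k) for k in (11, 22, 33, 44)]
```

Here `slots` comes out as [0, 1, 2, 3]: each colliding key lands just past the previous one, so the run of occupied slots grows, and any later key hashing anywhere into that run extends it further.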

  22. Red-Black Trees

  23. Roadmap • Definition • Height • Insertion • restructuring • recoloring • Deletion • restructuring • recoloring • adjustment

  24. Red-Black Tree • A red-black tree can also be defined as a binary search tree that satisfies the following properties: • Root Property: the root is black • External Property: every leaf is black • Internal Property: the children of a red node are black • Depth Property: all the leaves have the same black depth
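The internal and depth properties can be verified with one recursive pass. A minimal sketch, assuming a node layout of this sketch's own design (external leaves are represented by None and count as black):

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def black_depth(node):
    """Return the common black depth of all leaves, or None if the
    internal or depth property is violated somewhere below node."""
    if node is None:
        return 1                          # external leaf: one black node
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return None               # internal property: red child of red
    ld, rd = black_depth(node.left), black_depth(node.right)
    if ld is None or rd is None or ld != rd:
        return None                       # depth property: unequal black depths
    return ld + (1 if node.color == BLACK else 0)
```

A tree passes when `black_depth` returns a number; adding a root-is-black check on top completes the four properties.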

  25. Height of a Red-Black Tree • Theorem: A red-black tree storing n items has height O(log n) • The search algorithm for a red-black search tree is the same as that for a binary search tree • By the above theorem, searching in a red-black tree takes O(log n) time

  26. Insertion • To perform operation insertItem(k, o), we execute the insertion algorithm for binary search trees and color the newly inserted node z red, unless it is the root. • We preserve the root, external, and depth properties. • If the parent v of z is black, we also preserve the internal property and we are done. • Else (v is red) we have a double red (i.e., a violation of the internal property), which requires a reorganization of the tree. • Example where the insertion of 4 causes a double red.

  27. Remedying a Double Red • Consider a double red with child z and parent v, and let w be the sibling of v. • Case 1: w is black • The double red is an incorrect replacement of a 4-node • Restructuring: we change the 4-node replacement • Case 2: w is red • The double red corresponds to an overflow • Recoloring: we perform the equivalent of a split

  28. Local invariants • A local invariant involves only the fields of an object and the fields of its tree-children. • We specify local invariants using the repOkLocal method.

  29. Restructuring • A restructuring remedies a child-parent double red when the parent red node has a black sibling • It is equivalent to restoring the correct replacement of a 4-node • The internal property is restored and the other properties are preserved

  30. Restructuring (cont.) • There are four restructuring configurations depending on whether the double red nodes are left or right children

  31. Recoloring • A recoloring remedies a child-parent double red when the parent red node has a red sibling • The parent v and its sibling w become black and the grandparent u becomes red, unless it is the root • The double red violation may propagate to the grandparent u

  32. Analysis of Insertion • Algorithm insertItem(k, o): 1. We search for key k to locate the insertion node z. 2. We add the new item (k, o) at node z and color z red. 3. while doubleRed(z): if isBlack(sibling(parent(z))) then z ← restructure(z); return; else { sibling(parent(z)) is red } z ← recolor(z). • Recall that a red-black tree has O(log n) height • Step 1 takes O(log n) time because we visit O(log n) nodes • Step 2 takes O(1) time • Step 3 takes O(log n) time because we perform • O(log n) recolorings, each taking O(1) time, and • at most one restructuring, taking O(1) time • Thus, an insertion in a red-black tree takes O(log n) time

  33. Deletion • To perform operation remove(k), we first execute the deletion algorithm for binary search trees • Let v be the internal node removed, w the external node removed, and r the sibling of w • If either v or r was red, we color r black and we are done • Else (v and r were both black) we color r double black, which is a violation of the internal property, requiring a reorganization of the tree • Example where the deletion of 8 causes a double black.

  34. Remedying a Double Black • The algorithm for remedying a double black node w with sibling y considers three cases • Case 1: y is black and has a red child • We perform a restructuring, equivalent to a transfer, and we are done • Case 2: y is black and its children are both black • We perform a recoloring, equivalent to a fusion, which may propagate up the double black violation • Case 3: y is red • We perform an adjustment, equivalent to choosing a different representation of a 3-node, after which either Case 1 or Case 2 applies • Deletion in a red-black tree takes O(log n) time

  35. Red-Black Tree Reorganization

  36. Binary Trees

  37. Binary Trees • A tree in which no node can have more than two children • The depth of an “average” binary tree is considerably smaller than N, even though in the worst case, the depth can be as large as N – 1.

  38. Example: Expression Trees • Leaves are operands (constants or variables) • The other nodes (internal nodes) contain operators • Will not be a binary tree if some operators are not binary

  39. Binary Trees • Possible operations on the Binary Tree ADT • parent • left_child, right_child • sibling • root, etc • Implementation • Because a binary tree has at most two children, we can keep direct pointers to them

  40. Compare: Implementation of a general tree

  41. Binary Search Trees • Stores keys in the nodes in a way so that searching, insertion and deletion can be done efficiently. • Binary search tree property • For every node X, all the keys in its left subtree are smaller than the key value in X, and all the keys in its right subtree are larger than the key value in X
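The binary-search-tree property can be checked recursively by carrying lower and upper bounds down the tree; a minimal sketch (the node layout and function name are assumptions of this sketch):

```python
import math

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def is_bst(node, lo=-math.inf, hi=math.inf):
    """Every key in X's left subtree must be smaller than X's key,
    every key in its right subtree larger; (lo, hi) carries the
    bounds inherited from X's ancestors."""
    if node is None:
        return True
    return (lo < node.key < hi
            and is_bst(node.left, lo, node.key)
            and is_bst(node.right, node.key, hi))
```

Carrying the bounds is what catches the subtle violations, such as a key deep in the left subtree that is larger than the root.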

  42. Binary Search Trees A binary search tree Not a binary search tree

  43. Binary search trees Two binary search trees representing the same set: • Average depth of a node is O(log N); maximum depth of a node is O(N)

  44. Implementation

  45. Searching BST • If we are searching for 15, then we are done. • If we are searching for a key < 15, then we should search in the left subtree. • If we are searching for a key > 15, then we should search in the right subtree.

  46. Searching (Find) • Find X: return a pointer to the node that has key X, or NULL if there is no such node • Time complexity • O(height of the tree)
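The three-way comparison described above becomes a simple loop; a minimal sketch (the node layout is an assumption of this sketch):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def find(node, x):
    """Return the node whose key is x, or None if there is no such node.
    Each step descends one level, so the cost is O(height of the tree)."""
    while node is not None:
        if x == node.key:
            return node
        node = node.left if x < node.key else node.right
    return None
```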

  47. Inorder traversal of BST • Print out all the keys in sorted order Inorder: 2, 3, 4, 6, 7, 9, 13, 15, 17, 18, 20
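The inorder traversal that produces this sorted output can be sketched as (node layout assumed by this sketch):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(node, out=None):
    """Visit the left subtree, then the node, then the right subtree.
    By the BST property, the keys are appended in sorted order."""
    if out is None:
        out = []
    if node is not None:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)
    return out
```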

  48. FindMin/ FindMax • Return the node containing the smallest element in the tree • Start at the root and go left as long as there is a left child. The stopping point is the smallest element • Similarly for findMax • Time complexity = O(height of the tree)
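The "go left until you stop" rule is a two-line loop; a minimal sketch (node layout assumed, and node must not be None):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def find_min(node):
    """Keep following left children; the stopping point is the minimum."""
    while node.left is not None:
        node = node.left
    return node

def find_max(node):
    """Symmetrically, keep following right children for the maximum."""
    while node.right is not None:
        node = node.right
    return node
```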

  49. Insert • Proceed down the tree as you would with a find • If X is found, do nothing (or update something) • Otherwise, insert X at the last spot on the path traversed • Time complexity = O(height of the tree)
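The insert procedure follows the same path as find and attaches the new node at the last spot; a minimal sketch (node layout assumed; duplicates are ignored, one common convention):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(node, x):
    """Descend as find would; attach x where the search falls off the tree.
    Returns the (possibly new) root of the subtree."""
    if node is None:
        return Node(x)                      # last spot on the path traversed
    if x < node.key:
        node.left = insert(node.left, x)
    elif x > node.key:
        node.right = insert(node.right, x)  # x == node.key: do nothing
    return node
```

Usage: starting from an empty tree, `root = None`, repeated calls `root = insert(root, k)` build the tree one key at a time, each call costing O(height of the tree).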
