Theory Moon Jung Chung



Presentation Transcript


  1. Theory. Moon Jung Chung. CSE838 Lecture notes, copyright: Moon Jung Chung

  2. Parallel Minimum Spanning Tree (Deterministic)
     Initially, each node is a super node.
     Repeat until only one super node remains:
       For each super node, among the edges that connect it to another super node, select an edge of minimum weight.
       Merge the two super nodes joined by each selected edge into one super node.
     How many phases? --> O(log n) phases, since the number of super nodes at least halves in each phase.
     Each phase: O(log n) time on a CRCW PRAM. Actually, with priority CRCW, O(1) time.
     Complexity: O(log n) time with O(m) PEs on a priority CRCW PRAM, where m is the number of edges.
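The super-node scheme on this slide can be sketched sequentially. The following is a minimal Python simulation, not a PRAM program: the function name `boruvka_mst`, the `(weight, u, v)` edge format, the union-find bookkeeping, and the tie-break by edge index are all assumptions made for illustration. Each pass of the `while` loop stands for one parallel phase.

```python
def boruvka_mst(n, edges):
    """Sequentially simulate the super-node merging scheme.

    n: number of nodes, labelled 0..n-1
    edges: list of (weight, u, v) tuples
    Returns (mst_edges, phases).
    """
    parent = list(range(n))  # union-find: every node starts as its own super node

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst, phases, components = [], 0, n
    while components > 1:
        # Step 1: every super node selects its minimum-weight outgoing edge.
        # Ties are broken by edge index so that no cycle can be selected.
        best = {}
        for idx, (w, u, v) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge lies inside one super node
            for r in (ru, rv):
                if r not in best or (w, idx) < best[r][0]:
                    best[r] = ((w, idx), u, v)
        if not best:
            break  # the graph is disconnected; no further merging is possible
        # Step 2: merge along the selected edges.
        for (w, _idx), u, v in best.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # both endpoints may have selected the same edge
                parent[ru] = rv
                mst.append((w, u, v))
                components -= 1
        phases += 1
    return mst, phases


tree, phases = boruvka_mst(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)])
print(sum(w for w, _, _ in tree), phases)
```

Every super node that is not yet the whole graph merges with at least one other in each pass, which is why the number of passes is logarithmic in n.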

  3. Parallel Minimum Spanning Tree (Deterministic: detailed)
     Repeat until there is only one super node:
       For each edge (x, y), if x and y are in different components, write component(x) = y and component(y) = x.
       Each node resolves the concurrent writes with priority CW, accepting the minimum value of component.
       Merge the two super nodes into a single super node.
     Complexity: O(log n) time with O(m) PEs on a priority CRCW PRAM, where m is the number of edges.
     How to avoid priority CRCW? ==> What if the tree only has to be a spanning tree?
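One way to picture the "accept the minimum value" rule is to simulate a single phase in which every cross edge proposes a label to both of its super nodes and the smallest proposal wins. The sketch below is a sequential Python illustration under assumptions: the name `hook_phase`, the label list `component`, and the pointer-jumping step are hypothetical, and edge weights are ignored, so it tracks only how components merge, not which edges enter the tree.

```python
def hook_phase(component, edges):
    """Simulate one phase of minimum-wins concurrent writes.

    component: list where component[v] is the current label of node v
    edges: list of (x, y) pairs
    Returns the new label list after hooking and pointer jumping.
    """
    # Concurrent-write step: each cross edge writes the other side's label.
    # Keeping the minimum models the priority rule; the default is the
    # super node's own label, so a label never increases.
    proposal = {}
    for x, y in edges:
        cx, cy = component[x], component[y]
        if cx != cy:
            proposal[cx] = min(proposal.get(cx, cx), cy)
            proposal[cy] = min(proposal.get(cy, cy), cx)

    hooked = {c: proposal.get(c, c) for c in set(component)}

    # Pointer jumping: follow hooks until a label that points to itself.
    # Hooks only go to smaller labels, so every chain terminates.
    def root(c):
        while hooked[c] != c:
            c = hooked[c]
        return c

    return [root(c) for c in component]


labels = hook_phase([0, 1, 2, 3], [(0, 1), (2, 3)])
print(labels)
```

Because a super node only ever adopts a label no larger than its own, the hooks cannot form a cycle, which is the property the priority write is used for here.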

  4. Parallel Spanning Tree (Probabilistic)
     For each edge, if it connects two different super nodes, add the edge to the spanning tree and merge the two super nodes into a single super node.
     Problems: two edges connecting the same pair of super nodes may be selected at the same time. And what about cycles?
     To prevent these troubles:
       For each super node, randomly select an edge that connects it to another super node.
       Verify whether two different super nodes selected the same edge.
       Verify that there are no cycles.
       If the selected edge is OK, include it in the spanning tree and merge the two super nodes.
     How many phases? --> O(log n) phases on average.
     Each phase: O(1) time on average.
     Complexity: O(log n) time with O(m) CREW PEs.
     Parallel Connected Components in EREW => use matrix multiplication: O(log^2 n) time using O(n^2) PEs.
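The matrix-multiplication route to connected components mentioned at the end of this slide can be sketched as repeated Boolean squaring of the reachability matrix. This is a sequential Python illustration under assumptions (the function name and the "smallest reachable node" labelling are hypothetical); on a PRAM each squaring would be one parallel matrix product, and only the number of squarings, about log n, is modelled here.

```python
def connected_components_by_squaring(n, edges):
    """Label components by squaring the Boolean reachability matrix.

    n: number of nodes, labelled 0..n-1
    edges: list of undirected (u, v) pairs
    Returns (labels, squarings); labels[v] is the smallest node reachable from v.
    """
    # reach[i][j] is True when j is reachable from i by a path of length <= bound
    reach = [[i == j for j in range(n)] for i in range(n)]
    for u, v in edges:
        reach[u][v] = reach[v][u] = True

    squarings, bound = 0, 1
    while bound < n - 1:  # simple paths have length at most n - 1
        reach = [
            [any(reach[i][k] and reach[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        bound *= 2  # each squaring doubles the covered path length
        squarings += 1

    labels = [min(j for j in range(n) if reach[i][j]) for i in range(n)]
    return labels, squarings


labels, squarings = connected_components_by_squaring(6, [(0, 1), (1, 2), (3, 4)])
print(labels, squarings)
```

The O(log^2 n) bound on the slide corresponds to O(log n) squarings, each needing O(log n) time to combine n partial results without concurrent access.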

  5. Parallel Models
     (i) Shared memory (PRAM) -- deterministic. How about probabilistic? Example: minimum spanning tree.
     (ii) Circuit: depth and size
     (iii) Alternating Turing machine
     Brent's Theorem: Any depth-d, size-n combinational circuit with bounded fan-in can be simulated by a p-processor CREW algorithm in O(n/p + d) time.
     Proof: Store the inputs to the combinational circuit in the PRAM. Each gate evaluates its output once all of its inputs are ready. If there are not enough PEs, evaluate the gates in order of depth (depth of a gate: the longest path from the primary inputs).
     Complexity: Let $n_i$ be the number of gates at depth $i$. The simulation takes $\lceil n_i/p \rceil$ steps for the gates at depth $i$.
     Total time: $\sum_i \lceil n_i/p \rceil \le \sum_i (n_i/p + 1) = n/p + d$.
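The counting argument in the proof can be checked numerically. The short Python sketch below assumes a circuit described only by its number of gates per depth level (the list `levels` and the function name are made up for illustration) and compares the level-by-level schedule length with the n/p + d bound.

```python
from math import ceil


def brent_schedule_steps(gates_per_level, p):
    """Steps needed when p processors evaluate the circuit one depth level at a time."""
    return sum(ceil(n_i / p) for n_i in gates_per_level)


levels = [8, 4, 2, 1]      # hypothetical circuit: n_i gates at depth i
n, d, p = sum(levels), len(levels), 3
steps = brent_schedule_steps(levels, p)
print(steps, n / p + d)    # schedule length versus the bound n/p + d
```

The schedule never exceeds the bound because rounding up costs at most one extra step per level, and there are d levels.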

  6. Parallel Models
     Brent's Theorem for EREW: Any depth-d, size-n combinational circuit with bounded fan-in and fan-out can be simulated by a p-processor EREW algorithm in O(n/p + d) time.
     Proof: For exclusive reading, each output value is copied to all gates where it is used. With bounded fan-in and fan-out, this takes constant time. Reading the copies one by one also takes constant time.

  7. Uniform Circuit
     Let L be a language. What is the circuit complexity of L?
     Definition 1: f(n) = number of gates of a circuit accepting the strings of length n in L.
     Definition 1 may not be an acceptable one: let L = {0^n | the n-th TM accepts the n-th input}.
     L is not even recursive. But L has circuit complexity 1: for each length n there are only two candidate circuits, one that always answers yes and one that always answers no.

  8. Uniform Circuit
     Let $L_n$ = {w | w is in L and |w| = n}.
     There is a family of circuits {$C_n$}, and generating $C_n$ can be done in polynomial time using O(log n) space. Each gate has bounded fan-in.
     Example of non-uniform: division circuit => O(log n) time, but generating it requires polynomial space!
     $NC^k$ = {L | there is a uniform circuit family of polynomial size and $(\log n)^k$ depth}
     NC = $\bigcup_k NC^k$
     Note: $SC^k$ = {L | there is a TM with polynomial time and $(\log n)^k$ space}
     Relationship between SC and NC?

  9. Alternating TM
     The TM forks at each state. Subprocesses cannot communicate!
     The TM has two types of states: universal and existential.
     Universal: all branches must accept.
     Existential: at least one branch must lead to an accepting state.
     That is, each computation can be represented as a computation tree. Depth of the computation tree: time complexity.
     Note: Deterministic TM: the computation is a single path. Parallel random access machine: processes can communicate.
     ASPACE(log n) = P
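The acceptance rule for universal and existential states amounts to evaluating an AND/OR tree. Here is a minimal Python sketch, assuming a hypothetical tuple encoding of the computation tree (`("forall", children)`, `("exists", children)`, `("accept",)`, `("reject",)`); it does not model a tape or a transition function.

```python
def accepts(node):
    """Evaluate a computation tree of an alternating machine."""
    kind = node[0]
    if kind == "accept":
        return True
    if kind == "reject":
        return False
    results = [accepts(child) for child in node[1]]
    if kind == "forall":   # universal state: every branch must accept
        return all(results)
    if kind == "exists":   # existential state: one accepting branch suffices
        return any(results)
    raise ValueError(f"unknown node kind: {kind}")


def tree_depth(node):
    """Depth of the computation tree, the slide's measure of time."""
    if node[0] in ("accept", "reject"):
        return 0
    return 1 + max(tree_depth(child) for child in node[1])


tree = ("exists", [
    ("forall", [("accept",), ("reject",)]),
    ("forall", [("accept",), ("accept",)]),
])
print(accepts(tree), tree_depth(tree))
```

The branches are evaluated independently of one another, matching the remark that subprocesses cannot communicate.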

  10. Parallel Computation Thesis
      Parallel computation thesis: parallel time is polynomially equivalent to sequential space.
      Example: the parallel time of a vector machine is equivalent to sequential space; likewise ATIME(f(n)) and DSPACE(f(n)).
      ATM(S(n), T(n)): the class of languages accepted by an ATM with space S(n) and time T(n).
      Theorem: ATM(log n, $(\log n)^k$) = $NC^k$

  11. NC-algorithm and P-complete problems
      Let f be a function. Input: the input of f. Output: the value of f.
      f is $NC^1$ reducible to g: using oracles for g, we can construct $NC^1$ circuits computing f. An oracle gate is counted as depth log n, size n.
      Division: input: x and y; output: x/y
      Reciprocal: input: x; output: $x^{-1}$
      Powering: input: x; output: $x^i$ expressed in $n^2$ bits
      Example: Division < Reciprocal: using Reciprocal, compute $y^{-1}$, then compute $x \cdot y^{-1}$.
      Reciprocal < Powering: trivial
      How to construct a log-depth powering circuit? ==> seems not easy
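The two reductions can be illustrated with exact rational arithmetic, treating each oracle as an ordinary function argument. This is a sketch under assumptions: the function names are hypothetical, and the geometric series 1/x = sum over i >= 0 of (1 - x)^i (valid for 0 < x <= 1) is one standard way to obtain a reciprocal from powers, not something the slide itself spells out.

```python
from fractions import Fraction


def divide_via_reciprocal(x, y, reciprocal):
    """Division < Reciprocal: one oracle call, then one multiplication."""
    return x * reciprocal(y)


def reciprocal_via_powering(x, terms, power):
    """Reciprocal < Powering, sketched with a truncated geometric series.

    Assumes 1/2 <= x <= 1, so u = 1 - x satisfies 0 <= u <= 1/2 and the
    truncation error u**terms / x shrinks geometrically.
    """
    u = 1 - x
    return sum(power(u, i) for i in range(terms))


def exact_power(base, i):
    """Stand-in for the powering oracle."""
    return base ** i


approx = reciprocal_via_powering(Fraction(3, 4), 20, exact_power)
quotient = divide_via_reciprocal(Fraction(7), Fraction(2), lambda y: 1 / y)
print(float(approx), quotient)
```

In a circuit, the powers would be computed in parallel by oracle gates and summed by an addition tree, so the depth is governed by the oracle depth rather than by the number of terms.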

  12. NC-algorithm and P-complete problems
      Special case of a function: language recognition.
      Let A and B be languages. A is $NC^1$ reducible to B if, using an oracle for B, A can be solved in $NC^1$.
      A log space reduction is an NC reduction.
      If A $<_{NC}$ B and B is in NC, then A is also in NC.
      A is complete for P <=> A is in P and, for any B in P, B $<_{\log n}$ A.
      Theorem: Let A be a P-complete problem (with respect to log space reduction). If A is in NC, then P $\subseteq$ NC.

  13. P-complete and hard problems to make parallel
      Examples of P-complete problems:
        Monotone Circuit Value problem:
          Input: a monotone circuit (AND and OR gates, but no NOT gates) and values for the primary inputs.
          Question: Is the output of the circuit 0 with the given primary input values?
        Generating the lexicographically smallest depth-first search tree
      These P-complete problems may not be parallelizable!
      Open questions: perfect matching, depth-first search (directed, undirected), integer GCD, modular exponentiation
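For reference, the Monotone Circuit Value problem is easy to solve sequentially: evaluate the gates in topological order. The Python sketch below assumes a hypothetical encoding in which each gate is a `(name, op, left, right)` tuple and gates are listed after the wires they read; the open issue on the slide is whether this can be done in polylogarithmic parallel time, which the code does not address.

```python
def monotone_circuit_value(inputs, gates, output):
    """Evaluate a monotone circuit (AND/OR gates only).

    inputs: dict mapping primary input names to booleans
    gates: list of (name, op, left, right), in topological order
    output: name of the output wire
    """
    values = dict(inputs)
    for name, op, left, right in gates:
        if op == "and":
            values[name] = values[left] and values[right]
        elif op == "or":
            values[name] = values[left] or values[right]
        else:
            raise ValueError(f"non-monotone or unknown gate: {op}")
    return values[output]


inputs = {"x1": True, "x2": False, "x3": True}
gates = [
    ("g1", "and", "x1", "x2"),
    ("g2", "or", "g1", "x3"),
    ("g3", "and", "g2", "x1"),
]
result = monotone_circuit_value(inputs, gates, "g3")
print("output is 0" if not result else "output is 1")
```

The loop takes one step per gate, so it runs in time linear in the circuit size, but each gate waits on earlier ones, which is the sequential dependency that makes the problem a candidate for being inherently hard to parallelize.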
