

Presentation Transcript



  2. Delaunay mesh refinement • Triangulate a given set of points. • Delaunay property: No point is contained within the circumcircle of a triangle. • Quality property: No bad triangles, i.e., triangles with an angle > 120°. • Mesh refinement: Fix bad triangles through an iterative algorithm.
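The two properties on this slide can be checked with a few lines of geometry. The following is a minimal sketch, not taken from the presentation: the class name `TriangleChecks` and its methods are hypothetical, points are plain `double` coordinates, and the 120° threshold is the one stated on the slide. Degenerate triangles (with a zero-length side) are not handled.

```java
// Hypothetical helper, not part of the presentation: checks the Delaunay and
// quality properties for a single triangle (a, b, c).
final class TriangleChecks {
    /** Twice the signed area of (a, b, c); positive when the points are counter-clockwise. */
    static double orient(double ax, double ay, double bx, double by, double cx, double cy) {
        return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
    }

    /** True if p lies strictly inside the circumcircle of triangle (a, b, c). */
    static boolean inCircumcircle(double ax, double ay, double bx, double by,
                                  double cx, double cy, double px, double py) {
        double adx = ax - px, ady = ay - py;
        double bdx = bx - px, bdy = by - py;
        double cdx = cx - px, cdy = cy - py;
        // Standard 3x3 in-circle determinant, expanded along the squared-length column.
        double det = (adx * adx + ady * ady) * (bdx * cdy - cdx * bdy)
                   - (bdx * bdx + bdy * bdy) * (adx * cdy - cdx * ady)
                   + (cdx * cdx + cdy * cdy) * (adx * bdy - bdx * ady);
        // The sign of det flips with the orientation of the triangle.
        return orient(ax, ay, bx, by, cx, cy) > 0 ? det > 0 : det < 0;
    }

    /** Largest interior angle in degrees (law of cosines on the longest side). */
    static double largestAngleDegrees(double ax, double ay, double bx, double by,
                                      double cx, double cy) {
        double a2 = (bx - cx) * (bx - cx) + (by - cy) * (by - cy);
        double b2 = (ax - cx) * (ax - cx) + (ay - cy) * (ay - cy);
        double c2 = (ax - bx) * (ax - bx) + (ay - by) * (ay - by);
        double longest = Math.max(a2, Math.max(b2, c2));
        double others = a2 + b2 + c2 - longest;            // sum of the two shorter squared sides
        double product = Math.sqrt(a2 * b2 * c2 / longest); // product of the two shorter sides
        return Math.toDegrees(Math.acos((others - longest) / (2 * product)));
    }

    /** Bad triangle in the sense of the slide: some angle exceeds 120 degrees. */
    static boolean isBad(double ax, double ay, double bx, double by, double cx, double cy) {
        return largestAngleDegrees(ax, ay, bx, by, cx, cy) > 120.0;
    }
}
```

A mesh satisfies the Delaunay property when `inCircumcircle` is false for every triangle and every other mesh point; refinement then repeatedly repairs the triangles for which `isBad` is true.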

  3. [Figure: a cavity and its retriangulation]

  4. Sequential mesh refinement
       Mesh m = /* read input mesh */
       Worklist wl = new Worklist(m.getBad());
       foreach triangle t in wl {
         Cavity c = new Cavity(t);
         c.expand();
         c.retriangulate();
         m.updateMesh(c);
         wl.add(c.getBad());
       }
     • Cavities are contiguous.
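The worklist pattern in this loop can be tried out on a much simpler problem. The sketch below is a one-dimensional analogy and not the mesh algorithm itself: an interval stands in for a triangle, "bad" means longer than `maxLen`, and bisection stands in for retriangulation. All names are hypothetical, and `maxLen` is assumed to be positive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical 1-D analogy of the refinement loop: fixing one bad element
// may create new bad elements, which go back onto the worklist.
final class WorklistRefinement {
    static boolean isBad(double[] interval, double maxLen) {
        return interval[1] - interval[0] > maxLen;
    }

    /** Bisects [lo, hi] until no piece is longer than maxLen (maxLen must be > 0). */
    static List<double[]> refine(double lo, double hi, double maxLen) {
        List<double[]> mesh = new ArrayList<>();          // elements that are already good
        Deque<double[]> worklist = new ArrayDeque<>();    // elements that still need fixing
        double[] initial = {lo, hi};
        if (isBad(initial, maxLen)) worklist.add(initial); else mesh.add(initial);
        while (!worklist.isEmpty()) {
            double[] bad = worklist.poll();
            double mid = (bad[0] + bad[1]) / 2;
            // "Retriangulate": replace the bad element by two smaller ones.
            for (double[] piece : new double[][] {{bad[0], mid}, {mid, bad[1]}}) {
                if (isBad(piece, maxLen)) worklist.add(piece); else mesh.add(piece);
            }
        }
        return mesh;
    }
}
```

As in the mesh case, the order in which the worklist is drained does not matter for the final result, which is what leaves room for processing independent elements in parallel.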

  5. Parallelization • Typical shared-memory implementation: • Mesh = Heap-allocated graph data structure • Nodes = triangles. Edges = adjacency • Atomicity: overlapping cavities must not be processed at the same time. • Irregular data-parallelism: Extent of parallelism depends on input. • Non-overlapping cavities processed in parallel. • In worst case, no parallelism. • In typical case, cavities are mostly non-overlapping. • A lot of recent work, notably by Pingali et al.
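The atomicity constraint can be illustrated with a small scheduling sketch. This is an assumption-laden illustration rather than anything from the presentation: cavities are modeled as sets of integer triangle ids, and the hypothetical method `independentRound` greedily picks cavities that share no triangle, so that they could safely be processed in the same parallel round.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: choose a set of pairwise non-overlapping cavities.
final class CavityScheduler {
    /** Returns the indices of cavities that can run together; the rest wait for a later round. */
    static List<Integer> independentRound(List<Set<Integer>> cavities) {
        Set<Integer> claimed = new HashSet<>();   // triangles already owned in this round
        List<Integer> round = new ArrayList<>();
        for (int i = 0; i < cavities.size(); i++) {
            Set<Integer> cavity = cavities.get(i);
            if (Collections.disjoint(claimed, cavity)) {
                claimed.addAll(cavity);
                round.add(i);
            }
        }
        return round;
    }
}
```

In the worst case every cavity overlaps the first one and the round contains a single cavity; in the typical case described above most cavities are accepted.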

  6. Social networks in epidemics (*) [Figure: example network of agents Adam, Beth, Chitra, David, and Eve] • Agents = parties among whom an epidemic may be spreading. • Each node models an agent at a physical location (e.g., school). • At a given location, an agent interacts with the same set of other agents. • Edge = potential of interaction. • Assumption: degree of a node < 9. • Atomicity: overlapping social interactions are not processed simultaneously. (*) Burke et al. Individual-based computational modeling of smallpox epidemic control strategies. Academic Emergency Medicine, 13(11):1142-1149, 2006.

  7. Irregular parallelism • Heap-allocated data structures like lists, trees, and graphs. • Almost impossible to parallelize statically. (Shape analysis does not work.) • Needed: Dynamic methods. • Needed: Programming abstractions to express and exploit whatever parallelism is permitted by the problem instance.

  8. List of irregular applications (lifted from Pingali et al.) • Delaunay mesh refinement, Delaunay triangulation • Agglomerative clustering, ray tracing • Social network maintenance • Minimum spanning tree, Maximum flow • N-body simulation, epidemiological simulation • Sparse matrix-vector multiplication, sparse Cholesky factorization • Belief propagation, survey propagation in Bayesian inference • Iterative dataflow analysis, Petri net simulation • Finite-difference PDE solution

  9. The Lonestar challenge • Lonestar benchmarks: joint project of UT Austin (Pingali’s group) and IBM. • Four widely used, large, irregularly data-parallel applications: • Delaunay mesh refinement • Delaunay triangulation • Focused community discovery in social networks [K. Hildrum and P. Yu. Focused Community Discovery. IEEE Conference on Data Mining, 2005.] • Barnes-Hut N-body simulation [J. Barnes and P. Hut. A hierarchical O(N log N) force-calculation algorithm. Nature, 324(4):446-449, 1986.] Parallelize this!

  10. Roadmap • Locality of effects • The Sociable Objects model • The Sirius language • Case studies • Implementation and evaluation • Related and future work

  11. Delaunay mesh refinement
        Mesh m = /* read input mesh */
        Worklist wl = new Worklist(m.getBad());
        foreach triangle t in wl {
          Cavity c = new Cavity(t);
          c.expand();
          c.retriangulate();
          m.updateMesh(c);
          wl.add(c.getBad());
        }
      • Cavity = Contiguous region in the mesh. • Pattern: “Own a contiguous region, update it, release the region.”

  12. Effects of updates are local [Figure: a cavity] • On a mesh of ~100,000 triangles from the Lonestar benchmarks (about half of them bad): average cavity size = 3.75 triangles, maximum cavity size = 12 triangles. • Locality of effects is the essence of parallelism.

  13. Social networks in epidemics [Figure: example network of agents Adam, Beth, Chitra, David, Eve, and Ganesh] • Node = agent at a physical location (e.g., school). • Edge = potential of interaction. • Interactions are local. • Effects of updates restricted to neighborhoods.

  14. Locality and current approaches • Threads + explicit locking: • Heap abstraction is global. • Threads can follow pointers anywhere unless explicitly forbidden. • Low-level and error-prone. • Monitors: • Abstraction for atomic updates on individual objects. • Missing: Atomic updates on collectives of objects (region in the heap). • Heap abstraction is global.


  16. Current approaches (contd.) • Transactions: • Heap abstraction is global. • Burden of reasoning passed to transaction manager. • In most implementations, conflicts are detected by monitoring memory reads and writes, so detection is either conservative or expensive. [Pingali et al. 2007, 2008] • To come up later: Galois system, PGAS languages like X10. • Our goal: capture locality of effects.

  17. Roadmap • Locality of effects • The Sociable Objects model • The Sirius language • Case studies • Implementation and evaluation • Related and future work

  18. Design ideas • Treat “neighborhoods” in the heap as first-class citizens. • Neighborhood = contiguous region in heap + sequential thread. Objects outside are invisible. • Primitives to declaratively, dynamically, and locally reconfigure neighborhoods. • Neighborhoods are typically small, offering massive parallelism. No worst-case guarantees.

  19. Heaps, regions, neighborhoods • Heap = connected directed graph. Nodes = objects. Labeled edges = pointers. • Region = weakly connected partition. • Neighborhood = Thread restricted to a region. (Better seen as short-lived tasks.)

  20. Neighborhood action: merging • One neighborhood merges with another neighborhood along an edge. • The merging neighborhood gets a bigger region. • The absorbed neighborhood dies. • To prevent races, the absorbed neighborhood must not be “busy” while the merge happens. • Synchronization construct. Local coarsening of parallelism.

  21. Neighborhood action: splitting • A neighborhood splits into several smaller neighborhoods. • Other neighborhoods are not affected. • Not a synchronization construct. Local refinement of parallelism.

  22. Neighborhood action: local updates • Example: x = u.f; • Attempts to access objects outside the region lead to exceptions. (Similar to out-of-place accesses in X10.)
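The three neighborhood actions (merging, splitting, local updates) can be mimicked in a small single-threaded toy model. This is a sketch under stated assumptions, not the Sirius runtime: the classes `HeapObject` and `Neighborhood` are hypothetical, `NonLocalException` borrows its name from the code shown later in the deck but is defined here from scratch, the initiator of a merge is assumed to survive while the other neighborhood dies, and a split is assumed to produce one neighborhood per object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy heap object; its labeled outgoing edges are pointers to other objects. */
final class HeapObject {
    final String name;
    final Map<String, HeapObject> fields = new HashMap<>();
    HeapObject(String name) { this.name = name; }
}

/** Raised when a neighborhood touches an object outside its region (toy version). */
final class NonLocalException extends RuntimeException {
    final HeapObject target;
    NonLocalException(HeapObject target) {
        super("object outside region: " + target.name);
        this.target = target;
    }
}

/** Toy neighborhood: a region of the heap plus the right to access it. */
final class Neighborhood {
    final Set<HeapObject> region = new HashSet<>();
    boolean alive = true;

    Neighborhood(HeapObject... objects) { region.addAll(Arrays.asList(objects)); }

    /** Local update "x = u.f": both u and the object it points to must be in the region. */
    HeapObject read(HeapObject u, String f) {
        if (!region.contains(u)) throw new NonLocalException(u);
        HeapObject x = u.fields.get(f);
        if (x != null && !region.contains(x)) throw new NonLocalException(x);
        return x;
    }

    /** Merge: this neighborhood absorbs the other's region; the other one dies. */
    void merge(Neighborhood other) {
        region.addAll(other.region);
        other.region.clear();
        other.alive = false;
    }

    /** Split at the finest granularity: one new neighborhood per object; the parent dies. */
    List<Neighborhood> split() {
        List<Neighborhood> children = new ArrayList<>();
        for (HeapObject o : region) children.add(new Neighborhood(o));
        region.clear();
        alive = false;
        return children;
    }
}
```

With objects `a`, `b` in one neighborhood and `c` in another, following `b.f` to `c` raises `NonLocalException` until the first neighborhood merges with the second, after which the same read succeeds.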

  23. A core language (Core Sirius)

  24. Program • Collection of neighborhood class declarations. • Neighborhood class: a set of local variables (including an initial variable) and an action. • Variables point to local objects. • Action: a set of guarded updates.

  25. Program • Collection of neighborhood class declarations. • Neighborhood class: a set of local variables (including an initial variable) and an action. • Action: a set of guarded updates. Semantics: 1. Top-level: nondeterministically choose a guarded update. 2. Atomically execute the guard. 3. Execute the update. 4. Go back to top-level.
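The top-level loop of this semantics can be sketched for a single neighborhood as follows. This is a simplified illustration, not Core Sirius: the classes `GuardedUpdate` and `ActionLoop` are hypothetical, guards are plain side-effect-free predicates (in the model they may also merge neighborhoods), nondeterministic choice is approximated by a random pick among the currently enabled updates, and the `maxSteps` bound exists only to keep the sketch terminating.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.BooleanSupplier;

/** Hypothetical pairing of a guard with the update it protects. */
final class GuardedUpdate {
    final BooleanSupplier guard;
    final Runnable update;
    GuardedUpdate(BooleanSupplier guard, Runnable update) {
        this.guard = guard;
        this.update = update;
    }
}

/** Hypothetical single-threaded rendering of the top-level loop. */
final class ActionLoop {
    /** Runs until no guard holds or maxSteps updates have executed; returns the step count. */
    static int run(List<GuardedUpdate> action, Random rng, int maxSteps) {
        int steps = 0;
        while (steps < maxSteps) {
            // Top-level: collect the guarded updates whose guard currently holds.
            List<GuardedUpdate> enabled = new ArrayList<>();
            for (GuardedUpdate g : action) {
                if (g.guard.getAsBoolean()) enabled.add(g);
            }
            if (enabled.isEmpty()) break;
            // Choose one of them and execute its update, then return to top-level.
            enabled.get(rng.nextInt(enabled.size())).update.run();
            steps++;
        }
        return steps;
    }
}
```

For example, an action with the update "increment a counter" guarded by "counter < 3" executes three steps and then stops because no guard is enabled.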

  26. Actions • Guards do not modify the heap, but may merge neighborhoods. All synchronization happens in guards. • Update = imperative modification of regions. Updates may also split. Due to isolation, no precaution is needed to enforce atomicity of updates.

  27. Merging • Control is at top-level. • Execute S. • Local-variable state stays the same.

  28. Merging (variant) [Figure labels: Control at top-level; Execute S] • Variant: Become a neighborhood of the class named in the merge. • The initial variable of that class gets the value of v. • S gets ignored.

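  29. Splitting • The new neighborhoods are all of the class named in the split. • The initial variable of each new neighborhood points to one object of the old region. • The local state of the parent neighborhood is destroyed. • Refinement at finest granularity.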

  30. Split one • Initial variable of […] points to […]. • If the region gets disconnected, the “main” child neighborhood is the one containing […]. • The main child inherits the local state of the parent. (Variables pointing outside the region are de-initialized.) • Other child neighborhoods split into individual objects.

  31. Other updates • Examples: x = u.f; u.f = x; • Attempts to access objects outside the region lead to exceptions. • If the region gets disconnected, the main child neighborhood contains […]. • Other children split into objects.

  32. Computations are unordered • No global ordering between merges and splits. • Guarantee: a neighborhood isn’t merged in the middle of an update. • A neighborhood can be “killed” by a merge any time it is at the top-level.

  33. Delaunay mesh refinement • Use two neighborhood classes: Triangle and Cavity. • Cavity = contiguous region in mesh. • Each triangle: • Determines if it is bad (local check). • If so, merges with neighbors to become cavity. • Each cavity: • Determines if it is complete (local check). • If no, merges with a neighbor. • If yes, retriangulates (locally) and splits.


  35. Delaunay mesh refinement: sketch
        nhood Triangle:: ...
          action:: merge (v.f, Cavity, u) when bad?: skip
        nhood Cavity:: ...
          action:: merge (v.f) when (not complete?): skip
                   complete?: retriangulate(); split(Triangle)
      What happens on a conflict? • Cavity i is “absorbed” by cavity j. • Cavity j now has some “unnecessary” triangles. • j will later split.

  36. Granularity of parallelism • In the worst case, the whole heap merges into one neighborhood. • In the typical case, neighborhoods merge, but only into small neighborhoods. • Only as much parallelism as the input permits.

  37. Data races • Updates only modify locally isolated objects. • A neighborhood can merge with another neighborhood only when the latter is not in the middle of an update. • Therefore, no races.

  38. Local enabling • A neighborhood has a locally enabled merge with another neighborhood when: • merge(u.f): S is in its action set, or • merge(u.f) when g: S is in its action set and g is satisfied. • etc.

  39. Deadlocks • Classic definition: Process P waits for a resource from Q and vice versa. • Deadlock in Sociable Objects: • One neighborhood has a locally enabled merge with a second. • The second has a locally enabled merge with the first. • No other progress is possible. • But one of the merges can always be carried out. (A neighborhood can always be killed at top-level.) • Theorem: Assuming updates terminate, no deadlock.

  40. Responsiveness • Definition: If a merge is locally enabled in a neighborhood, then eventually, one of the following happens: • The merge goes through. • The neighborhood is “killed” by a different neighborhood merging with it. • Enforcement requires refinement of semantics: • For each neighborhood, track in a queue all neighborhoods that have locally enabled merges with it. • Let the neighborhood carry out its own merge only when this queue is empty. • Otherwise, merges in the queue get precedence. • In message-passing lingo, order receives before sends. (Done by the runtime, not the programmer.) • Theorem: The refined semantics enforces responsiveness.

  41. Connection with X10 • In X10: • Places = Static memory partitions • Activities located in places. • Many activities at one place. • In Sociable Objects: • Neighborhood = “Contiguous” place + activity • One activity at one neighborhood. • Local reconfiguration. • Dynamic creation.

  42. Roadmap • Locality of effects • The Sociable Objects model • The Sirius language • Case studies • Implementation and evaluation • Related and future work

  43. Sirius: embedding of Sociable Objects into Java • Neighborhood classes: • Constructor: Cavity (v1,v2) { … } • Merges and splits pass parameters: merge (u.f, u1, u2): S; split (Triangle,u1,u2); • Read-only data: • Data structures can be set to readonly. • Writeable data can be converted to readonly at any time. • Interleaved sequential and parallel phases. (In a sense, fork-join parallelism.)

  44. Roadmap • Locality of effects • The Sociable Objects model • The Sirius language • Case studies • Implementation and evaluation • Related and future work

  45. Delaunay mesh refinement
        nhood Triangle {
          Triangle(TriangleObject t) {
            if (t.isBad())
              become(Cavity);      // become a Cavity
          }
        } /* end Triangle */
        nhood Cavity {
          action {                 // expand cavity
            merge(outgoingedges, TriangleObject t):
              { outgoingedges.remove(t);
                frontier.add(t);
                build(); }
          }
          Set members; Set border;
          Queue frontier;          // current frontier
          List outgoingedges;      // outgoing edges on which to merge
          TriangleObject initial;
          ...
        nhood Loader {
          Loader(String filename) {
            ...
            ... new TriangleObject (p1, p2, p3); …
            split(SingleTriangle);
          }
        } /* end Loader */

  46. Delaunay mesh refinement (contd.)
        Cavity(Triangle t) {
          ... initialize data fields
          frontier.enqueue(t);
          build(); }
        void build() {
          while (frontier.size() != 0) {
            TriangleObject curr = frontier.dequeue();
            try {
              if (isMember(curr)) members.add(curr);
              else border.add(curr);
              // add triangles using BFS
              for (TriangleObject n: curr.neighbors())
                if (notSeen(n)) frontier.add(n);
            } catch(NonLocalException e) {
              // triangle not in nhood, add to merge list
              outgoingedges.add(e.getObject()); }
          }
          if (outgoingedges.isEmpty()) {
            retriangulate(); split(Triangle);
          }
        }
        ...

  47. Boruvka’s algorithm for minimum spanning tree • Intuition: • A spanning tree is a neighborhood • Two trees can merge to form a bigger tree. • At the end, we have the full spanning tree. • Initially, • Each tree (neighborhood) has one node. • Private data = list of weights of outgoing edges. • As algorithm progresses, trees merge.

  48. Minimum spanning tree
        shared Node {
          List edges;
          List weights;
        }
        nhood ComputeSpanningTree {
          Spanning tree;
          Node root;
          Edge minOutEdge;
          action {
            merge(minOutEdge) :
              computeNewMinEdge();
          }
          // computes the new minimal edge of
          // the merged tree
          void computeNewMinEdge() {...}
        }
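For comparison, here is a minimal sequential sketch of Boruvka's algorithm in plain Java. It is not the Sirius program above: a union-find array stands in for neighborhoods that merge, and the class `Boruvka`, the edge encoding `{u, v, weight}`, and the tie-breaking by edge index are all assumptions made for this illustration.

```java
import java.util.Arrays;

// Hypothetical sequential sketch of Boruvka's algorithm with union-find.
final class Boruvka {
    /** edges[i] = {u, v, weight}; returns the total weight of a minimum spanning forest. */
    static long mstWeight(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;   // initially each tree has one node
        long total = 0;
        boolean merged = true;
        while (merged) {
            merged = false;
            int[] best = new int[n];                 // best[root] = cheapest outgoing edge index
            Arrays.fill(best, -1);
            for (int i = 0; i < edges.length; i++) {
                int ru = find(parent, edges[i][0]), rv = find(parent, edges[i][1]);
                if (ru == rv) continue;              // edge is internal to one tree
                if (best[ru] == -1 || less(edges, i, best[ru])) best[ru] = i;
                if (best[rv] == -1 || less(edges, i, best[rv])) best[rv] = i;
            }
            for (int r = 0; r < n; r++) {
                if (best[r] == -1) continue;
                int[] e = edges[best[r]];
                int ru = find(parent, e[0]), rv = find(parent, e[1]);
                if (ru != rv) {                      // the two trees merge along their minimal edge
                    parent[ru] = rv;
                    total += e[2];
                    merged = true;
                }
            }
        }
        return total;
    }

    /** Strict order on edges: by weight, then by index, so ties are broken consistently. */
    static boolean less(int[][] edges, int i, int j) {
        return edges[i][2] < edges[j][2] || (edges[i][2] == edges[j][2] && i < j);
    }

    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];           // path halving
            x = parent[x];
        }
        return x;
    }
}
```

Each pass of the outer loop corresponds to a round in which every tree looks at its minimal outgoing edge and merges along it; in the neighborhood version these merges happen without a global round structure.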

  49. Barnes-Hut N-body simulation

  50. Barnes-Hut N-body simulation • Parallelization opportunities: At each step: • Summarizing each node in the octree (computing the centre of gravity for each rectangle). • Computing forces and advancing the bodies. • Simulation step computed sequentially.
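The summarization step can be sketched as a bottom-up pass over the tree. This is an illustrative sketch only: the class `TreeNode` and its fields are hypothetical, a leaf is assumed to hold exactly one body, and the summary of an internal node is its total mass together with the mass-weighted average position of its children.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node for the summarization step of Barnes-Hut.
final class TreeNode {
    double mass, x, y, z;   // leaf: the body itself; internal node: the computed summary
    final List<TreeNode> children = new ArrayList<>();

    TreeNode() {}

    TreeNode(double mass, double x, double y, double z) {
        this.mass = mass; this.x = x; this.y = y; this.z = z;
    }

    /** Bottom-up summary: total mass and centre of mass of the subtree. */
    void summarize() {
        if (children.isEmpty()) return;   // a leaf already describes its body
        double m = 0, sx = 0, sy = 0, sz = 0;
        for (TreeNode c : children) {
            c.summarize();                // sibling subtrees are independent of each other
            m += c.mass;
            sx += c.mass * c.x;
            sy += c.mass * c.y;
            sz += c.mass * c.z;
        }
        mass = m;
        if (m > 0) { x = sx / m; y = sy / m; z = sz / m; }
    }
}
```

Because the recursive calls on sibling subtrees touch disjoint parts of the tree, they are natural candidates for the parallelization opportunity named on the slide.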
