Learn how the Parallel Irregular Tree library tackles non-homogeneous, dynamic, and unpredictable problems. Explore the application in examples like Barnes Hut and Radiosity methods, with future work insights. Develop deep knowledge of hierarchical representation and distributed tree structures for effective problem-solving.
Solving Irregular Problems Through Parallel Irregular Trees
Fabrizio Baiardi, Paolo Mori, Laura Ricci
Dipartimento di Informatica, Università di Pisa
Istituto di Informatica e Telematica, CNR - Pisa
Outline
• Irregular problems main features
• Hierarchical representation of the domain
• Parallel Irregular Tree library
• Experimental results
• Future works
PDCN 2005
Irregular Problems
• the domain includes a set of elements characterised by
  • the position in the domain
  • other problem-specific properties
• the distribution of the elements is
  • non-homogeneous
  • dynamic and non-predictable
• the evolution of an element
  • depends upon that of other elements (locality)
  • updates the element properties
• Examples
  • Barnes Hut
  • Adaptive Multigrid Methods
  • Radiosity methods
Hierarchical Representation
• the domain is recursively partitioned into a set of spaces by applying a problem-dependent condition
• the Hierarchical Tree (Htree) represents the decomposition, and each Hnode represents either a space or an element
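The recursive partitioning described above can be sketched in C as a quadtree over a square 2D domain. This is only an illustration under stated assumptions: the types `Point` and `HNode`, the functions `hnode_new`, `hnode_push_down`, `htree_insert` and `htree_elements`, and the splitting condition (a space is split as soon as it holds more than one element, as in Barnes Hut style decompositions) are all hypothetical and not taken from the PIT library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration; they are not the PIT library's. */
typedef struct { double x, y; } Point;

typedef struct HNode {
    double cx, cy, half;     /* centre and half-width of the square space */
    int count;               /* number of elements inside this space */
    Point elem;              /* meaningful only for a leaf with count == 1 */
    struct HNode *child[4];  /* all NULL for a leaf */
} HNode;

void htree_insert(HNode *n, Point p);

HNode *hnode_new(double cx, double cy, double half) {
    HNode *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->cx = cx; n->cy = cy; n->half = half;
    return n;
}

/* Send p to the quadrant of n that contains it, creating the subspace on demand. */
void hnode_push_down(HNode *n, Point p) {
    int q = (p.x >= n->cx ? 1 : 0) + (p.y >= n->cy ? 2 : 0);
    if (n->child[q] == NULL) {
        double h = n->half / 2.0;
        n->child[q] = hnode_new(n->cx + ((q & 1) ? h : -h),
                                n->cy + ((q & 2) ? h : -h), h);
    }
    htree_insert(n->child[q], p);
}

/* Assumed condition: a space is split as soon as it holds more than one element.
   Positions are assumed distinct, otherwise the recursion would not terminate. */
void htree_insert(HNode *n, Point p) {
    if (n->count == 0) {          /* empty space: store the element here */
        n->elem = p;
        n->count = 1;
        return;
    }
    if (n->count == 1)            /* leaf holding one element: split it */
        hnode_push_down(n, n->elem);
    hnode_push_down(n, p);
    n->count++;
}

/* Number of Hnodes that represent a single element. */
int htree_elements(const HNode *n) {
    if (n == NULL) return 0;
    if (n->count <= 1) return n->count;
    int total = 0;
    for (int i = 0; i < 4; i++) total += htree_elements(n->child[i]);
    return total;
}
```

Inserting elements that lie close together yields a deeper subtree in that region, which is how a non-homogeneous distribution produces an irregular tree.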
Distributed Hierarchical Tree
• Htree representation distributed among the p-nodes: pt = <{h0, .., hn-1}, mHt>
• private Htree (pHt): subtree assigned to a p-node
• mapping Htree (mHt): represents the hierarchical relations among the private Htrees
[Figure: private Htrees h0, h1, h2, h3]
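One possible way to picture the pair <{h0, .., hn-1}, mHt> is a small replicated tree whose leaves record which p-node owns the private Htree rooted there. The C sketch below assumes a 1D domain and a binary mapping tree. The type `MNode`, the function `mht_owner`, and the idea of using the mHt to look up the owner of a position are illustrative assumptions, not the library's documented design.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node of a mapping Htree over the 1D interval [lo, hi). */
typedef struct MNode {
    double lo, hi;
    int owner;                   /* p-node owning the private Htree rooted here,
                                    or -1 for an internal node of the mHt */
    struct MNode *left, *right;  /* both non-NULL when owner == -1 */
} MNode;

/* Walk the replicated mapping tree to find the p-node responsible for x. */
int mht_owner(const MNode *n, double x) {
    assert(n != NULL && x >= n->lo && x < n->hi);
    while (n->owner < 0) {
        /* the left child covers [lo, left->hi), the right child the rest */
        n = (x < n->left->hi) ? n->left : n->right;
    }
    return n->owner;
}
```

Because every p-node would hold the same mapping tree, such a lookup needs no communication; only the private subtrees differ between p-nodes.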
PIT Library
• defines:
  • PITree
  • PIT operations
• key point: both the sequential and the parallel versions of the application are structured in terms of operations on Htrees
• aims
  • be a simple, complete and effective parallelization tool
  • hide the details of parallel programming from the user
  • preserve most of the sequential code
PIT API
• main operations
  • PITree creation
  • PITree completion
  • PITree update
• alternative API
  • standard
  • advanced
• composition of the adopted API
  • standard structure
  • customised for the specific problem
PITree Creation
• it creates the PITree starting from the domain elements
  • one (or more) pHt for each p-node
  • one mHt replicated in each p-node
• it implements a distributed strategy to make the best use of memory
• it needs some user-defined functions to manage the elements of the target problem
PITree Completion (I)
• standard API:
  • fault prevention and informed fault prevention
  • one function only implements the strategy
  • invoked before each operator
    PITree_completion(pht_root, stencil_0)
    tp_op_0(pht_root)    /* this comes from the sequential code */
PITree Completion (II)
• advanced API:
  • informed fault prevention only
  • two distinct functions
    • PITree_det_neighbors: invoked each time the neighbourhood relations among the elements change
    • PITree_exch_neighbors: invoked before each operator
    PITree_det_neighbors(pht_root, stencil_0)
    PITree_exch_neighbors(pht_root, stencil_0)
    tp_op_0(pht_root)    /* this comes from the sequential code */
PITree Update (I)
• advanced API: two distinct functions
• PITree correction:
  • updates the mapping of the elements violating the mapping strategy
  • it is invoked after each operator that updates the distribution
    tp_op_0(pht_root)
    PITree_correction(pht_root)
• PITree balance:
  • updates the mapping to redistribute the workload among the p-nodes
  • it is invoked after each operator that modifies the workload
    tp_op_0(pht_root)
    PITree_balance(pht_root, Tresh)
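The slides do not say how the `Tresh` argument of `PITree_balance` is interpreted. A plausible reading, sketched below in C, is a relative imbalance threshold: the workload is redistributed only when the most loaded p-node exceeds the average load by more than that fraction. The function `needs_rebalance` and its parameters are hypothetical and serve only to illustrate a threshold-driven balancing test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test: returns 1 when the heaviest of the p loads exceeds the
   average load by more than the relative threshold `thresh`, 0 otherwise. */
int needs_rebalance(const double *load, size_t p, double thresh) {
    assert(load != NULL && p > 0);
    double sum = 0.0, max = load[0];
    for (size_t i = 0; i < p; i++) {
        sum += load[i];
        if (load[i] > max) max = load[i];
    }
    double avg = sum / (double)p;
    if (avg == 0.0) return 0;        /* no work anywhere: nothing to move */
    return (max - avg) / avg > thresh;
}
```

With a test of this kind, a larger threshold tolerates more imbalance and triggers fewer redistributions, trading load balance against the cost of moving elements.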
PITree Update (II)
• standard API:
  • one function only, PITree update, implements the PITree correction and balancing
  • PITree update is invoked after each operator
    tp_op_0(pht_root)
    PITree_update(pht_root, Tresh)
Parallelization
• Standard:
  • the functions of the sequential version are inserted into the standard structure
  • the development is straightforward
  • a deep knowledge of the target problem is not required
• Customized:
  • the PIT operations are inserted into the sequential code according to the semantics of the target problem
  • a deep knowledge of the target problem is required
  • both the standard and the advanced API can be adopted
  • it achieves better efficiency
Sequential Code
irregular_problem(tElementList *dom) {
    ...
    root = Htree_creation(dom)
    ...
    while (not solution_computed) {
        tp_op_0(root)    /* problem operator: mainly consists of a visit of the Htree */
        …
        tp_op_n(root)
    }
}
Standard Structure
irregular_problem(tElementList *dom) {
    ...
    pht_root = PITree_creation(dom, dec_el, incl_el, rem_el)
    ...
    while (not solution_computed) {
        PITree_completion(pht_root, stencil_0)
        tp_op_0(pht_root)
        pht_root = PITree_update(pht_root, T)
        ….
        PITree_completion(pht_root, stencil_n)
        tp_op_n(pht_root)
        pht_root = PITree_update(pht_root, T)
    }
}
Customised Structure
irregular_problem(tElementList *dom) {
    …
    pht_root = PITree_creation(dom, dec_el, incl_el, rem_el)
    ...
    while (not solution_computed) {
        PITree_det_neighbors(pht_root, stencil_0+..+stencil_i)
        PITree_exch_neighbors(pht_root, stencil_0)
        tp_op_0(pht_root)
        …
        PITree_exch_neighbors(pht_root, stencil_i)
        tp_op_i(pht_root)
        PITree_correction(pht_root)
        PITree_det_neighbors(pht_root, stencil_i+1+..+stencil_n)
        …
        PITree_exch_neighbors(pht_root, stencil_n)
        tp_op_n(pht_root)
        PITree_update(pht_root)
    }
}
Validation
• Applications
  • Adaptive Multigrid Methods
  • Hierarchical Radiosity
• Parallel architectures
  • PC cluster
    • Intel Pentium II 266 MHz
    • 128 MB
    • 100 Mb/s Fast Ethernet
  • IBM Beowulf (x330)
    • Intel Pentium III 1.133 GHz
    • 1 GB per p-node (2 procs)
    • Myricom LAN (264MB)
Adaptive Multigrid Methods
• fast iterative methods to solve partial differential equations
• discretized and multi-level domain representation through a grid hierarchy
• adaptive problem:
  • the discretization is finer where the equation is irregular
  • new grids are added during the computation
• Poisson Problem
Sequential Code
amm(tElementList *initial_grid) {
    root = Htree_creation(initial_grid)
    while (not end) {
        smoothing(root, v, f, all_levels)
        for level from Lmax downto Lg {
            rest(root, level)
            restriction(root, level-1)
            smoothing(root, e, r, level-1)
        }
        for level from Lg+1 to Lmax {
            prolongation(root, level)
            correction(root, e, level)
            smoothing(root, e, r, level)
        }
        correction(root, v, all_levels)
        end = norm(root)
        if (not end) Lmax = refinement(root)
    }
}
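The slides name the `restriction` and `prolongation` steps but do not define them. As a minimal sketch on a uniform 1D grid, the C code below uses full-weighting restriction and linear interpolation, two common textbook choices assumed here rather than taken from the authors' code. The function names `restrict_fw` and `prolong_linear` and the array layout (a coarse grid of `nc` points matching a fine grid of `2*nc - 1` points) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fine-to-coarse transfer by full weighting (assumed operator).
   fine has 2*nc - 1 points, coarse has nc points, nc >= 2. */
void restrict_fw(const double *fine, double *coarse, size_t nc) {
    assert(fine != NULL && coarse != NULL && nc >= 2);
    coarse[0] = fine[0];                         /* boundary values are copied */
    coarse[nc - 1] = fine[2 * (nc - 1)];
    for (size_t i = 1; i + 1 < nc; i++)
        coarse[i] = 0.25 * fine[2 * i - 1]
                  + 0.50 * fine[2 * i]
                  + 0.25 * fine[2 * i + 1];
}

/* Coarse-to-fine transfer by linear interpolation (assumed operator). */
void prolong_linear(const double *coarse, double *fine, size_t nc) {
    assert(fine != NULL && coarse != NULL && nc >= 2);
    for (size_t i = 0; i + 1 < nc; i++) {
        fine[2 * i] = coarse[i];                              /* coincident point */
        fine[2 * i + 1] = 0.5 * (coarse[i] + coarse[i + 1]);  /* midpoint */
    }
    fine[2 * (nc - 1)] = coarse[nc - 1];
}
```

In the adaptive setting of the slides each grid level is stored in the Htree rather than in flat arrays, so operators of this kind would need values from neighbouring elements that may belong to other p-nodes.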
Parallel Code (I)
amm(tElementList *initial_grid) {
    pht_root = PITree_creation(initial_grid, dec_el, incl_el, rem_el)
    while (not end) {
        PITree_det_neighbors(pht_root, stencil_union)
        PITree_exch_neighbors(pht_root, smooth-rest_stencil, all_levels)
        smoothing(pht_root, v, f, all_levels)
        for level from Lmax downto Lg {
            PITree_exch_neighbors(pht_root, smooth-rest_stencil, level)
            rest(pht_root, level)
            PITree_exch_neighbors(pht_root, restriction_stencil, level)
            restriction(pht_root, level-1)
            PITree_exch_neighbors(pht_root, smooth-rest_stencil, level)
            smoothing(pht_root, e, r, level-1)
        }
Parallel Code (II)
        for level from Lg+1 to Lmax {
            PITree_exch_neighbors(pht_root, prolongation_stencil, level)
            prolongation(pht_root, level)
            correction(pht_root, e, level)
            PITree_exch_neighbors(pht_root, smooth-rest_stencil, level)
            smoothing(pht_root, e, r, level)
        }
        correction(pht_root, v, all_levels)
        PITree_exch_neighbors(pht_root, norm_stencil, level)
        end = norm(pht_root)
        if (not end) Lmax = refinement(pht_root)
        pht_root = PITree_update(pht_root, T)
    }
}
Domain Hierarchical Decomposition After 10 Iterations
Load Balancing
Efficiency
Hierarchical Radiosity
• a model of the light exchanges to compute the illumination of a scene
• representation of the scene
  • discretized and hierarchical
  • adaptive
• locality: interactions among objects at distinct abstraction levels
Sequential Code
hierarchical_rad(segment_list *scene) {
    root = Htree_creation(scene)
    visib_list_det(root)
    while (not end) {
        Gather_H(root)
        for level from L_min to L_max
            Push_H(root, level)
        for level from L_max downto L_min
            Pull_H(root, level)
        end = RefineLink_H(root)
    }
}
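The `Push_H` and `Pull_H` steps are not detailed in the slides. In the usual formulation of hierarchical radiosity, energy gathered at a coarse level is pushed down to the finer elements, and each parent then takes the area-weighted average of its children. The C sketch below combines both passes in one recursive visit of a binary hierarchy, whereas the slides apply them level by level. The type `Patch` and the function `push_pull` are hypothetical, and the children are assumed to partition the area of their parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hierarchy element; children partition the parent's area. */
typedef struct Patch {
    double gathered;         /* radiosity gathered at this level in this iteration */
    double radiosity;        /* result of the push-pull pass */
    double area;
    struct Patch *child[2];  /* both NULL for a leaf, both set otherwise */
} Patch;

/* Push the radiosity accumulated above (`down`) towards the leaves, then pull
   area-weighted averages back up. Returns the radiosity assigned to p. */
double push_pull(Patch *p, double down) {
    assert(p != NULL && p->area > 0.0);
    double total = p->gathered + down;
    if (p->child[0] == NULL) {             /* leaf: keep everything pushed so far */
        p->radiosity = total;
        return total;
    }
    double up = 0.0;
    for (int i = 0; i < 2; i++)
        up += push_pull(p->child[i], total) * p->child[i]->area;
    p->radiosity = up / p->area;           /* pull: area-weighted average */
    return p->radiosity;
}
```

Calling `push_pull(root, 0.0)` after a gather step leaves every element with a radiosity value consistent across the levels of the hierarchy.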
Parallel Code (I)
hierarchical_rad(segment_list *scene) {
    pht_root = PITree_creation(scene, dec_el, incl_el, rem_el)
    PITree_exch_neighbors(pht_root, vis_stencil, all_levels)
    visib_list_det(pht_root)
    while (not end) {
        PITree_exch_neighbors(pht_root, int_list, all_levels)
        Gather_H(pht_root)
        for level from L_min to L_max {
            PITree_exch_neighbors(pht_root, push_stencil, level)
            Push_H(pht_root, level)
        }
Parallel Code (II)
        for level from L_max downto L_min {
            PITree_exch_neighbors(pht_root, pull_stencil, level)
            Pull_H(pht_root, level)
        }
        end = RefineLink_H(pht_root)
        pht_root = PITree_balance(pht_root)
    }
}
Test Scene
• 192 polygons
• 896 segments
Efficiency
Future Works
• the definition of the set of problems that cannot be solved adopting our methodology
• the definition of programming constructs for the considered class of problems