## Nilesh Choudhury Parallel Programming Lab Department of Computer Science


**Nilesh Choudhury**
Parallel Programming Lab, Department of Computer Science, University of Illinois, Urbana-Champaign

Interconnection Network Topologies to Minimize Graph Diameter: Low Diameter Regular Graphs and Physical Wire Length Constrained Networks

**Motivation**
• Running a program on a large number of processors:
  • Large number of partitions
  • Large amount of communication
• Communication is the most common bottleneck for scaling a problem to a large number of machines
• Point-to-point communication times increase
  • Average hop count has increased
• Collective communication times increase
  • Need to send a larger number of messages
  • Average hop count has increased
  • Diameter has increased, so max. communication time is larger
  • Might be limited by the maximum time

**Solutions**
• Software communication optimizations
  • Mapping your partitions (sequential entities) to minimize communication
• The interconnection network itself could be optimized
  • Minimize the diameter of the network to speed up global collective operations
    • Broadcast
    • Reduction
  • Amount of bandwidth used
    • Minimize the total number of hops for all messages
    • Minimize the average number of hops

**Scope of this talk**
• Interconnection network design optimizations
• Routing algorithms for these networks
• Tradeoff between
  • Average hop distance
  • Maximum hop distance (diameter)
  • Simplicity of the routing algorithm
    • Must be implemented in hardware (in as few clock cycles as possible)

**Networks**
• Direct network
  • Each node is connected to a corresponding router
  • # routers = # nodes
  • Also called router-based networks
• Indirect network
  • A number of nodes are connected to one switch
  • Fat-trees are an example

**Network Parameters**
• Degree of a node
  • Connectivity of the node
• Bisection bandwidth
  • The minimum bidirectional capacity of a network between two equally sized partitions of the network
• Diameter
  • Length of the longest shortest path between any two nodes in the network
• Average length of the shortest path between all pairs of nodes

**Common Networks**
• HyperCube (N nodes):
  • Degree = logN
  • Diameter = logN
  • Bisection BW = N/2
  • Avg. internode distance = (logN)/2
• Fat Tree (k-ary n-tree) (N=k^n nodes):
  • Degree = k
  • Diameter = 2n
  • Bisection BW = N
  • Avg. internode distance = n

**Moore Graphs**
• Moore bound on the number of nodes N of a graph with degree d and diameter k:
  • N(d,k) <= (d(d-1)^k - 2) / (d-2)
• Very few graphs have been found that meet the Moore bound
• Notation: N(nodes, degree, diameter)
  • Petersen graph N(10, 3, 2)
  • Hoffman-Singleton graph N(50, 7, 2)
  • N(3250, 57, 2): possible but not yet discovered

**Low Diameter Regular Graph**
• Each node has the same degree, logN, as in a hypercube
• Diameter of the LDR graph is 2; that of the hypercube is 3 (so these figures are for N = 8, where logN = 3)
• Average internode distance is 1.375 for the LDR graph and 1.5 for the hypercube

**How to generate a LDR graph?**
• An LDR graph is built from a spanning tree
• No. of nodes = N
• Degree = k
• Start with the root and connect it to k children
• Connect each of the children to k-1 children of its own (each already has a parent)
• Continue until all N nodes have been used
• Leaves still have unconnected edges, which can be used to decrease the diameter, etc.

**How to generate a LDR graph? contd...**
• To choose the incomplete connections for leaves:
  • Pick a vertex 'A' with the maximum number of incomplete connections
  • Pick another vertex 'B' randomly from the remaining vertices
  • If A and B are already connected (A->B), pick another vertex
  • Continue the previous step until we find a legitimate vertex or no remaining vertex satisfies the condition
  • If we find a legitimate vertex, add an edge A->B
  • Else, disconnect some edge X-Y and connect A-X and B-Y

**Routing on a LDR graph**
• Hamming-distance routing for the hypercube is simple (XOR op)
• Deterministic for LDR
  • Shortest path routing
  • Table driven
• Need adaptivity in the presence of network contention!
• 64 / 2048 node LDR and hypercube using deterministic and adaptive routing

**How difficult is it to place a hypercube / LDR in physical space?**
• A hypercube with N nodes
  • logN dimensions
• Real world is 3D
  • Difficult to place nodes in 'n' dimensions
• Really big machines (Bluegene, RedStorm)
  • Use a 3D torus or similar
  • Easier to place them in physical space
• Large wire lengths mean large delays, not to mention cost
• We believe that for large machines, the topology should consider physical placement

**Framing the problem**
• The problem:
  • A network with connectivity 'k'
  • Maximum allowable wire length is 'd' (hops)
  • Design a network topology within these constraints, with the lowest diameter and average all-pair internode distance
• It should also have a simple routing algorithm

**Connected Graph**
• In 2D:
  • X+, X-, Y+, Y-
  • These 4 connections provide a connected graph
• In 3D:
  • X+, X-, Y+, Y-, Z+, Z-
  • These 6 connections provide a connected graph

**The proposed topology**
• The remaining connections are used to decrease the diameter and the average all-pair internode distance
  • i=1;
  • Add diagonal connections of length 'd'/i along all four directions
  • i *= 2;
  • Repeat the above step while the total # connections is less than 'k'

**Higher dimensions connectivity of a single node**
• The picture shows 2D connectivity of a single node
  • d/(sqrt(2)); d/(2sqrt(2)); d/(4sqrt(2)); ....
• Similarly, for 3D, connectivity of a single node
  • d/(sqrt(3)); d/(2sqrt(3)); d/(4sqrt(3)); ....
• We call these networks "PLCN" (Physical wire Length Constrained Networks)

**Intuitive Proof**
• Intuitively, anything along the diameter would be easy to reach
• sqrt(2)*max(x,y) hops is the max number of hops
• Once we reach a region by the longest hops, we explore the smaller region recursively

**Optimal value of 'd'**
• For different values of i, and hence k, we optimize the value of 'd' that minimizes 'D'
• 1D
  • For i=1; D=L/d + d/2; d=sqrt(L)
• 2D
  • For i=1; D=sqrt(2)*L/d + d/sqrt(2); d=sqrt(2L); L*L grid
• 3D
  • For i=1; D=sqrt(3)*L/d + d/sqrt(3); d=sqrt(3L); L*L*L grid
• For i=2, the equations become sufficiently complicated

**Average shortest all pair hop distance for k=8; 'd'; d=1 and d=4**
(Figure not reproduced.)

**Routing**
• Simple routing algorithm for different dimensions
  • Use the longest available hop towards the destination
  • Within this smaller region, use the above step recursively
  • The final step is to reach the destination using simple hops along the lowest-level connections along the axes
• With some care, this could easily be converted to adaptive minimal routing (if more than one shortest path is available)
• Non-minimal routing

**Conclusions**
• Communication topology is important
• Message latency should be minimized for an application to scale
• Non-trivial networks like LDR and PLCN are not so difficult to implement
• They can drastically reduce the average all-pair shortest distance and the diameter
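
The bound on the Moore Graphs slide can be checked numerically. The sketch below evaluates N(d,k) <= (d(d-1)^k - 2) / (d-2) for the three degree/diameter pairs listed there. The function name `moore_bound` is a label chosen for this illustration and does not come from the talk.

```python
def moore_bound(degree: int, diameter: int) -> int:
    """Upper bound on the node count of a graph with this degree and diameter.

    Uses the closed form from the slide, which is only defined for degree >= 3.
    """
    if degree < 3:
        raise ValueError("the closed form needs degree >= 3")
    numerator = degree * (degree - 1) ** diameter - 2
    # The numerator is always divisible by (degree - 2), so integer division is exact.
    return numerator // (degree - 2)


if __name__ == "__main__":
    # Degree/diameter pairs quoted on the slide.
    for name, degree in [("Petersen", 3), ("Hoffman-Singleton", 7), ("degree 57", 57)]:
        print(name, moore_bound(degree, 2))
```

For diameter 2 the three bounds come out as 10, 50 and 3250, which are the node counts quoted on the slide.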
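
The diameter and average internode distance quoted for the hypercube and the LDR graph can be reproduced with a breadth-first search from every node. This is a minimal sketch under two assumptions. First, the average is taken over all ordered pairs including each node with itself, which is the convention that yields (logN)/2 for the hypercube. Second, the talk does not say which 8-node graph it used, so the Wagner graph (an 8-node ring with chords to the opposite node) stands in here as one known degree-3 graph of diameter 2. All function names are illustrative.

```python
from collections import deque


def hop_metrics(adjacency):
    """Return (diameter, average hop distance) via BFS from every node.

    The average divides by n*n, i.e. it counts the zero distance from each
    node to itself (an assumption about the convention used in the talk).
    """
    n = len(adjacency)
    diameter, total = 0, 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != n:
            raise ValueError("graph is not connected")
        diameter = max(diameter, max(dist.values()))
        total += sum(dist.values())
    return diameter, total / (n * n)


def hypercube(dimensions):
    """Adjacency lists of a hypercube: neighbours differ in exactly one bit."""
    return [[node ^ (1 << bit) for bit in range(dimensions)]
            for node in range(1 << dimensions)]


def wagner_graph():
    """8-node, degree-3 stand-in for an LDR graph (not taken from the talk)."""
    return [[(i - 1) % 8, (i + 1) % 8, (i + 4) % 8] for i in range(8)]


if __name__ == "__main__":
    print("hypercube:", hop_metrics(hypercube(3)))
    print("wagner:   ", hop_metrics(wagner_graph()))
```

Under these assumptions the 3-dimensional hypercube gives a diameter of 3 and an average of 1.5, and the stand-in graph gives a diameter of 2 and an average of 1.375, matching the numbers on the Low Diameter Regular Graph slide.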
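
The 2D and 3D values on the Optimal value of 'd' slide follow from a one-line minimization. The symbol n for the number of dimensions is introduced here for convenience and is not in the original. Both expressions have the form below, and setting the derivative with respect to d to zero gives the optimum.

```latex
\begin{aligned}
D(d) &= \frac{\sqrt{n}\,L}{d} + \frac{d}{\sqrt{n}}, \\
\frac{\mathrm{d}D}{\mathrm{d}d} &= -\frac{\sqrt{n}\,L}{d^{2}} + \frac{1}{\sqrt{n}} = 0
  \;\Longrightarrow\; d^{2} = nL
  \;\Longrightarrow\; d = \sqrt{nL}, \\
D\!\left(\sqrt{nL}\right) &= \sqrt{L} + \sqrt{L} = 2\sqrt{L}.
\end{aligned}
```

With n=2 this gives d=sqrt(2L) and with n=3 it gives d=sqrt(3L), as on the slide. The 1D expression as written, D=L/d + d/2, does not have this form: the same calculation applied to it gives d=sqrt(2L), while d=sqrt(L) is the minimizer of D=L/d + d, so the two 1D entries may not correspond exactly as transcribed.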