
Computer architecture II


bendek

Presentation Transcript


  1. Computer Architecture II: Network topologies

  2. Plan for today: Scalable interconnection networks • Basic concepts, definitions • Topologies • Switching • Routing • Performance

  3. Outline • Basic concepts, definitions • Topologies • Switching • Routing • Performance

  4. Formalism • Graph $G = (V, E)$, where $V$ is the set of switches and nodes and $E \subseteq V \times V$ is the set of communication channels (edges) • Route: a path $(v_0, \ldots, v_k)$ of length $k$ between nodes $v_0$ and $v_k$, where $(v_i, v_{i+1}) \in E$ • Routing distance • Diameter: the maximum, over all pairs of nodes, of the shortest route length • Average distance • Degree: number of input (output) channels of a node • Bisection width: minimum number of channels that must be cut to split the network into two equal halves
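The definitions above can be checked on small graphs. The following is a minimal sketch, assuming the network is given as an adjacency dictionary of a connected, undirected graph; the helper names `bfs_distances`, `network_metrics` and `ring` are illustrative and do not come from the slides.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def network_metrics(adj):
    """Return (diameter, average distance, maximum degree) of a connected graph."""
    nodes = list(adj)
    all_d = []
    for s in nodes:
        d = bfs_distances(adj, s)
        all_d.extend(d[t] for t in nodes if t != s)
    diameter = max(all_d)              # longest of the shortest routes
    average = sum(all_d) / len(all_d)  # mean over all ordered node pairs
    degree = max(len(adj[u]) for u in nodes)
    return diameter, average, degree

def ring(n):
    """Ring of n >= 3 nodes: node i is linked with i-1 and i+1 modulo n."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

print(network_metrics(ring(8)))  # diameter 4, degree 2
```

Bisection width is left out of the sketch because computing it exactly for an arbitrary graph is much more expensive than a breadth-first search.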

  5. What characterizes a network? • Bandwidth (offered bandwidth): $b = w f$, where $w$ is the channel width (in bytes) and $f = 1/t$ is the signaling rate (in Hz) • Latency: the time a message takes to travel between two nodes • Throughput (delivered bandwidth): how much of the offered bandwidth is effectively used

  6. What characterizes a network? • Topology: physical interconnection structure of the network graph • Routing algorithm: restricts the set of paths that messages may follow; many algorithms with different properties • Switching strategy: how the data in a message traverses a route; circuit switching vs. packet switching • Flow control mechanism: when a message, or portions of it, traverse a route; what happens when traffic is encountered?

  7. Goals • Latency as small as possible • High throughput • As many concurrent transfers as possible • Bisection width gives the potential number of parallel connections • Cost as low as possible

  8. Bus (e.g. Ethernet) [Figure: nodes 1 to 5 attached to a shared bus] • degree = 1 • diameter = 1 • No routing necessary • bisection width = 1 • CSMA/CD protocol • Limited bus length • Simplest and cheapest dynamic network

  9. Complete graph [Figure: nodes 1 to 5, fully connected] • degree = n-1: too expensive for big networks • diameter = 1 • bisection width = $\lfloor n/2 \rfloor \lceil n/2 \rceil$ • Static network • Connection between each pair of nodes • When the network is cut into two halves, each node in one half is connected to the n/2 nodes in the other half, and there are n/2 such nodes

  10. Ring [Figure: nodes 1 to 5 connected in a ring] • degree = 2 • diameter = $\lfloor n/2 \rfloor$: slow for big networks • bisection width = 2 • Static network • Node i is linked with nodes i+1 and i-1 modulo n • Examples: FDDI, SCI, Fibre Channel Arbitrated Loop, KSR1

  11. d-dimensional grid [Figure: 3x3 grid with nodes labelled 1,1 to 3,3] • For d dimensions and n nodes: • degree = 2d for interior nodes • diameter = $d(\sqrt[d]{n} - 1)$ • bisection width = $(\sqrt[d]{n})^{d-1}$ • Static network • Examples: Cray T3D and T3E

  12. Crossbar [Figure: 3x3 crossbar with a switch at each crossing point] • Fast and expensive ($n^2$ switches) • Mostly used to connect processors with memory • degree = 1 • diameter = 2 • bisection width = n/2 • Examples: 4x4, 8x8, 16x16 • Dynamic network

  13. 0010 0110 0011 0111 0000 0100 0001 0101 Hypercube (1) Hamming-Distance = number of bits in which the binary representation of two numbers differ Two nodes are connected if the Hamming distance is 1 Routing from x to y by decreasing the Hemming distance 0010 0011 0000 0001 Static network Computer Architecture II

  14. 0110 0010 0010 0011 0111 0011 0000 0100 0000 0001 0001 0101 Hypercube (2) k dimensions, n= 2k nodes • degree= k • diameter = k • bisection width = n/2 Two (k-1)-hypercubes are linked through n/2 edges to form a k-hypercube Intel iPSC/860, SGI Origin 2000 Computer Architecture II

  15. Omega-Network (1) • Building block: 2x2 shuffle • Perfect shuffle: target = cyclic left shift of the source address [Figure: perfect shuffle connecting inputs 000 to 111 with outputs 000 to 111]

  16. Omega-Network (2) [Figure: omega network with inputs and outputs 000 to 111] • $\log_2 n$ levels of 2x2 shuffle building blocks • Dynamic network • Level i looks at bit i of the destination address: if it is 0 the message goes up, if it is 1 it goes down • See the example for 100 sending to 110
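The 100 to 110 example can be traced with a short sketch. It assumes that each level first applies the perfect shuffle and then a 2x2 switch, and that level i reads the destination bits starting from the most significant one; the bit order is an assumption, since the transcript does not state it. The function names are illustrative.

```python
def perfect_shuffle(addr, k):
    """Cyclic left shift of a k-bit address."""
    mask = (1 << k) - 1
    return ((addr << 1) & mask) | (addr >> (k - 1))

def omega_route(src, dst, k):
    """Return the line position of a message after each of the k levels."""
    pos = src
    positions = []
    for level in range(k):
        pos = perfect_shuffle(pos, k)
        bit = (dst >> (k - 1 - level)) & 1  # destination bit, MSB first (assumed)
        pos = (pos & ~1) | bit              # 0: upper switch output, 1: lower output
        positions.append(pos)
    return positions

trace = omega_route(0b100, 0b110, 3)
print([format(p, "03b") for p in trace])  # ['001', '011', '110']
```

After the last level every bit of the position has been replaced by a destination bit, which is why the message arrives at the destination regardless of where it started.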

  17. Omega-Network (3) • n nodes, $(n/2) \log_2 n$ building blocks • degree = 2 for nodes, 4 for building blocks • diameter = $\log_2 n$ • bisection width = n/2 • For a random permutation, n/2 messages are expected to cross the network in parallel • Extremes: • If all the nodes want to send to node 0, only one message crosses at a time • If each node sends a message to itself, n messages cross in parallel

  18. Fat Tree/Clos-Network (1) • Nodes = leaves of a tree • The tree has diameter $2 \log_2 n$ (from the leftmost leaf over the root to the rightmost leaf) • A simple tree has bisection width = 1: bottleneck • Fat tree: • Edges at level i have twice the capacity of edges at level i-1 • At level i, expensive switches with $2^i$ inputs and $2^i$ outputs • Known as Clos networks

  19. Fat Tree/Clos-Network (2) • Routing: • Direct way over the lowest common parent • When alternatives exist, choose randomly • Tolerance to node failure • diameter = $2 \log_2 n$, bisection width = n • Example: CM-5
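Routing over the lowest common parent determines the hop count directly. The sketch below assumes a binary tree whose n leaves are numbered 0 to n-1 from left to right; `fat_tree_hops` is an illustrative name, and the random choice among parallel links of a fat tree is not modelled.

```python
def fat_tree_hops(src, dst):
    """Hops between two leaves when routing over their lowest common parent."""
    # The highest differing address bit tells how many levels must be climbed
    # before both leaves lie in the same subtree.
    up = (src ^ dst).bit_length()
    return 2 * up  # climb `up` levels, then descend `up` levels

print(fat_tree_hops(2, 3))  # 2: siblings share a parent
print(fat_tree_hops(0, 7))  # 6: over the root of an 8-leaf tree
```

For n = 8 the leftmost and rightmost leaves are $2 \log_2 8 = 6$ hops apart, matching the diameter given on the slide.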

  20. Switching • How a message traverses the network from one node to the other • Circuit switching • One path from source to destination established • All packets will take that way • Like the telephone system • Packet switching • A message broken into a sequence of packets which can be sent across different routes • Better utilization of network resources

  21. Packet Routing • There are two basic approaches to routing packets, based on what a switch does when the packet begins arriving: • Store-and-forward • Cut-through, in two variants: • Virtual cut-through • Wormhole

  22. Packet routing: Store-and-Forward • A packet is completely stored at a switch before being forwarded • The packet is always on at most two nodes • Problem: switches need a lot of memory for storing the incoming packets • Switching takes place step by step, so the danger of blocking is small

  23. Packet routing: Cut-through • A packet may enter the switch partially and leave its tail on other nodes • It may reside on more than 2 switches • The decision to forward the packet may be taken right away • What to do with the rest of the packet if the head blocks? • Virtual cut-through: gather the tail where the head is • It degenerates into store-and-forward under high contention • Wormhole: if the head blocks, the whole "worm" blocks

  24. Store-and-Forward vs. Cut-Through Routing • Store-and-forward latency: $h(n/b + \Delta)$ • Cut-through latency: $n/b + h\Delta$ • h: number of hops • n: message size • b: bandwidth • $\Delta$: routing delay per hop
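The two latency expressions can be compared numerically. This is a small sketch with made-up example values; the function names and the chosen numbers are illustrative only, and contention is ignored as in the formulas.

```python
def store_and_forward_latency(n, b, h, delta):
    """Whole packet is received at every hop before it is forwarded."""
    return h * (n / b + delta)

def cut_through_latency(n, b, h, delta):
    """Transfer time is paid once; only the routing delay is paid per hop."""
    return n / b + h * delta

# Hypothetical example: 1000-byte message, 100 bytes per time unit,
# 5 hops, routing delay of 1 time unit per hop.
print(store_and_forward_latency(1000, 100, 5, 1))  # 55.0
print(cut_through_latency(1000, 100, 5, 1))        # 15.0
```

The gap grows with the number of hops, because store-and-forward multiplies the transfer time n/b by h while cut-through does not.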

  25. Routing Algorithm • How do I know where a packet should go? • Topology does NOT determine routing • Routing algorithms • Arithmetic • Source-based • Table lookup • Adaptive—route based on network state (e.g., contention)

  26. (1) Arithmetic Routing • For regular topologies, use simple arithmetic to determine the route • E.g., 3D torus, xy-routing [Figure: cube with corner nodes labelled (0,0,0) to (1,1,1)] • Packet header contains a signed offset to the destination (per dimension) • At each hop, the switch increments or decrements the offset in one dimension to reduce it • When x == 0 and y == 0, the packet is at the correct processor • Drawbacks: • Requires an ALU in the switch • Must recompute the CRC at each hop
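The header arithmetic can be sketched for a torus of arbitrary dimension. The sketch assumes that offsets are chosen along the shorter way around each ring and that dimensions are handled in order; the names `signed_offset` and `route_by_offsets` are illustrative and not taken from any real switch.

```python
def signed_offset(src, dst, size):
    """Shortest signed offset from src to dst on a ring of `size` nodes."""
    d = (dst - src) % size
    return d if d <= size // 2 else d - size

def route_by_offsets(src, dst, dims):
    """Route on a torus; the header holds one signed offset per dimension."""
    header = [signed_offset(s, t, n) for s, t, n in zip(src, dst, dims)]
    pos = list(src)
    path = [tuple(pos)]
    for axis, size in enumerate(dims):
        while header[axis] != 0:
            step = 1 if header[axis] > 0 else -1
            pos[axis] = (pos[axis] + step) % size  # move one hop along this axis
            header[axis] -= step                   # switch reduces the offset
            path.append(tuple(pos))
    return path                                    # all offsets are 0 on arrival

print(route_by_offsets((0, 0), (3, 1), (4, 4)))  # [(0, 0), (3, 0), (3, 1)]
```

In the example the x offset is -1 rather than +3, because wrapping around the 4-node ring is shorter.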

  27. (2) Source-Based and (3) Table Lookup Routing • Source-based: • Source specifies the output port for each switch on the route • Very simple switches • No control state • Each switch strips its output port off the header • Myrinet uses this • Cannot be made adaptive • Table lookup: • Very small header: contains a field that is an index into a table of output ports • Big tables, which must be kept up to date
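Source-based routing can be illustrated with a toy model. The sketch assumes a header that is simply a list of output ports and a hypothetical four-switch topology stored in a dictionary; it is not Myrinet's actual header format.

```python
def source_route(header, payload, switches, start):
    """Forward a packet whose header lists one output port per switch."""
    current = start
    visited = [current]
    ports = list(header)
    while ports:
        port = ports.pop(0)                # switch strips its port off the header
        current = switches[current][port]  # follow the link behind that port
        visited.append(current)
    return visited, payload

# Hypothetical topology: switch name -> {output port: neighbour}.
switches = {
    "A": {0: "B", 1: "C"},
    "B": {0: "D"},
    "C": {0: "D"},
    "D": {},
}
print(source_route([1, 0], b"data", switches, "A"))  # (['A', 'C', 'D'], b'data')
```

The switches need no routing state of their own, which is why they can be very simple, but the route is fixed once the source has written the header.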

  28. Deterministic vs. Adaptive Routing [Figure: 3-dimensional cube with nodes labelled 000 to 111] • Deterministic: follows a pre-specified route • k-ary d-cube: dimension-order routing • $(x_1, y_1) \rightarrow (x_2, y_2)$ • First $\Delta x = x_2 - x_1$, • then $\Delta y = y_2 - y_1$ • Tree: common ancestor • Adaptive: route determined by contention for the output port

  29. (4) Adaptive Routing • Essential for fault tolerance • At least multipath • Can improve utilization of the network • Simple deterministic algorithms easily run into bad permutations

  30. Contention • Two packets trying to use the same link at the same time • Limited buffering • Drop? • Most parallel machine networks block in place • Traffic may back up toward the source • Tree saturation: traffic backs up all the way along the routes leading to the destination • Alternative: discard packets and inform the source

  31. Communication Performance: Latency • $\mathrm{Time}(n)_{s-d}$ = overhead + routing delay + channel occupancy + contention delay • Overhead: time necessary for initiating the sending and reception of a message • Channel occupancy = $(n + n_e) / b$ • n: data (payload) size • $n_e$: packet envelope size • Routing delay • Contention
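The latency sum can be written out as a small function. This is a sketch with hypothetical example values; the parameter names follow the symbols on the slide, and units are whatever is used consistently for sizes, bandwidth and time.

```python
def message_latency(n, n_e, b, overhead, routing_delay, contention_delay=0.0):
    """Source-to-destination time for a payload of size n with envelope n_e."""
    occupancy = (n + n_e) / b  # time the channel is busy with this packet
    return overhead + routing_delay + occupancy + contention_delay

# Hypothetical example: 960-byte payload, 64-byte envelope, 1024 bytes per
# time unit, overhead 2.0, routing delay 0.5, no contention.
print(message_latency(960, 64, 1024, overhead=2.0, routing_delay=0.5))  # 3.5
```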

  32. Bandwidth • What affects local bandwidth? • Packet density: $b \cdot n / (n + n_e)$ • Routing delay: $b \cdot n / (n + n_e + w\Delta)$ • $\Delta$: number of cycles spent waiting for a routing decision • w: width of the channel • Contention • at the endpoints • within the network • Aggregate bandwidth • Bisection bandwidth • Sum of the bandwidths of the smallest set of links that partition the network • Bad if communication is not uniformly distributed • Total bandwidth of all the channels
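The two local-bandwidth expressions differ only in the $w\Delta$ term, so one function covers both. This is a sketch with hypothetical numbers; setting `w` or `delta` to 0 gives the packet-density case.

```python
def effective_bandwidth(b, n, n_e, w=0, delta=0):
    """Payload bandwidth left after envelope and routing-delay losses."""
    # w * delta is the channel capacity lost while waiting `delta` cycles
    # for a routing decision on a channel of width w.
    return b * n / (n + n_e + w * delta)

# Hypothetical example: raw bandwidth 100, payload 900, envelope 100.
print(effective_bandwidth(100, 900, 100))                # 90.0
print(effective_bandwidth(100, 900, 100, w=4, delta=25))
```

With a 10 percent envelope the payload sees 90 percent of the raw bandwidth; adding routing delay lowers the figure further.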

  33. Interconnects

  34. Myrinet • Offered bandwidth 2+2 Gbit/s, full duplex • 5-7 µs latency • Arbitrary topology, Fat Tree/Clos network preferable • Routing: wormhole, source routing • Cable (8+1 bit parallel) or fiber optics • Flow control on each link • Adaptor: • Programmable RISC processor, 333 MHz • PCI/PCI-X connection, up to 133 MHz, 64 bit • 8 Gb/s over the PCI-X bus, unidirectional • 2 MB SRAM

  35. Myrinet Fat Tree (128 nodes) [Figure: fat tree built from 16x16 crossbars]

  36. Myrinet PCI-Bus Adaptor [Figure: block diagram with cable connector, network interface, Net-DMA, 2 MB SRAM, Host-DMA, PCI bridge and LanAI CPU] • 2 MB SRAM • PCI(-X) bridge, 64 bit, 66-133 MHz • LanAI RISC, 333 MHz • 2 fiber-optic connectors, both duplex

  37. Myrinet 16x16 crossbar • 8 computers connected on the front side (2 channels) • On the back side, 8 outputs (2 channels) toward the next level of the Clos network • 32x32, two

  38. 128-node Clos network • Building block from earlier

  39. Myrinet 256+256 Clos network • Routing network with bisection width 256 • Front side: 256 computer connections • Back side: 256 connections to next-level routing units

  40. Clos-Network with full bisection width: 64 nodes and 32 nodes
