NoC Placement & Routing

NoC Placement & Routing: a survey of placement and routing challenges in Network-on-Chip systems

Overview • What is the difference between SoC and NoC placement and routing? Eyal Friedman 2008

Overview cont'd • SoC: no routers (custom I/F); custom PE-to-PE protocol; direct communication • NoC: added routers (router I/F); packet-based protocol; wormhole communication (link sharing)

Overview cont'd • These differences require each PE in a NoC to have a router I/F for incoming and outgoing communication with other PEs, and to cope with messages arriving as packets. • The PEs' main function, however, is the same. • So how do placement algorithms for SoC and NoC differ?

The General Placement Problem (SoC) • Given a set of fixed cells (and soft cells) with fixed pins, and a netlist describing their connectivity, find the best location for each cell. • How do we determine what is best? • With a cost function. Its input is a placed design and a netlist; its output is a price. The lowest price is the best solution. • Usually we want the cost function to concentrate on performance, latency, power, temperature, or area.

SoC routing • Classically, placement aims to minimize the total wirelength.
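A wirelength-based cost function can be sketched as follows. This is only an illustration: it uses the half-perimeter wirelength (HPWL) estimate, which the slides do not name, and it treats every cell as a point, ignoring pin offsets. The names `hpwl`, `Placement` and `Netlist` are hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, Tuple[float, float]]  # cell name -> (x, y) location
Netlist = List[List[str]]                   # each net lists the cells it connects


def hpwl(placement: Placement, netlist: Netlist) -> float:
    """Sum, over all nets, of the half-perimeter of the net's bounding box."""
    total = 0.0
    for net in netlist:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Toy design (assumed data): three cells, one 2-pin net and one 3-pin net.
placement = {"a": (0, 0), "b": (3, 0), "c": (3, 4)}
netlist = [["a", "b"], ["a", "b", "c"]]
price = hpwl(placement, netlist)  # (3 + 0) + (3 + 4)
```

A placer would call such a function on every candidate placement and keep the one with the lowest price.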

Algorithmic approaches to SoC placement • Netlist partitioning: a constructive approach for the initial solution, followed by iterative improvement by cell swaps • Force-directed methods: connections between cells represent "dragging forces" • Quadratic placement: analytical mathematical minimization • Simulated annealing: iterative with randomness • Hybrid (a mix of the above)

Force-Directed approach • In this method we assume that connections between modules create forces of attraction between them. • A drawback of this approach is that the forces pull the placement towards local minima.

Force-Directed approach cont'd • The initial placement influences the final result.
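The attraction idea can be sketched as an iteration that moves each movable cell to the weighted centroid of the cells it is connected to, which is the point where the attractive forces cancel. This is a minimal sketch under assumptions not in the slides: some cells (for example I/O pads) are fixed, edge weights stand for connection strength, and overlaps are not resolved. The function name `force_directed` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Edge = Tuple[str, str, float]  # (cell, cell, connection weight)


def force_directed(pos: Dict[str, Point], edges: List[Edge],
                   fixed: Set[str], iterations: int = 50) -> Dict[str, Point]:
    """Repeatedly move every movable cell to its zero-force position."""
    pos = dict(pos)  # work on a copy of the initial placement
    for _ in range(iterations):
        for cell in pos:
            if cell in fixed:
                continue
            sx = sy = sw = 0.0
            for a, b, w in edges:
                if cell == a:
                    other = b
                elif cell == b:
                    other = a
                else:
                    continue
                ox, oy = pos[other]
                sx += w * ox
                sy += w * oy
                sw += w
            if sw > 0:
                pos[cell] = (sx / sw, sy / sw)  # weighted centroid of neighbours
    return pos


# Assumed toy data: one movable cell "m" pulled by two fixed pads.
start = {"p0": (0.0, 0.0), "p1": (10.0, 0.0), "m": (9.0, 9.0)}
edges = [("m", "p0", 3.0), ("m", "p1", 1.0)]
result = force_directed(start, edges, fixed={"p0", "p1"})
```

The stronger connection to `p0` pulls `m` closer to it. A real placer would add a legalization step to remove overlaps.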

Simulated annealing approach • This method is a randomized procedure for finding approximate solutions to optimization problems where greedy techniques fail because of local minima.
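A minimal simulated-annealing sketch is given below. It assumes a toy problem that is not in the slides: cells are placed in a single row of slots, a move swaps two cells, and the cost is the total slot distance over two-pin nets. The cooling schedule and all parameter values are illustrative assumptions.

```python
import math
import random
from typing import Callable, List, Tuple


def anneal(order: List[str], cost: Callable[[List[str]], float],
           t_start: float = 10.0, t_end: float = 0.01, alpha: float = 0.95,
           moves_per_t: int = 50, seed: int = 0) -> Tuple[List[str], float]:
    rng = random.Random(seed)
    current = list(order)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            i, j = rng.sample(range(len(current)), 2)
            current[i], current[j] = current[j], current[i]  # trial swap
            new_cost = cost(current)
            delta = new_cost - current_cost
            # Always accept improvements; accept a worse move with
            # probability exp(-delta / t), which lets the search leave
            # local minima while the temperature is still high.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current_cost = new_cost
                if new_cost < best_cost:
                    best, best_cost = list(current), new_cost
            else:
                current[i], current[j] = current[j], current[i]  # undo swap
        t *= alpha  # cool down
    return best, best_cost


nets = [("a", "b"), ("b", "c"), ("c", "d")]  # assumed toy netlist


def row_wirelength(order: List[str]) -> float:
    slot = {cell: i for i, cell in enumerate(order)}
    return float(sum(abs(slot[u] - slot[v]) for u, v in nets))


start = ["a", "c", "b", "d"]
best, best_cost = anneal(start, row_wirelength)
```

The returned solution is never worse than the starting placement, because the best placement seen so far is tracked separately from the current one.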

NoC Architecture • Routing is one of the key factors that will decide the success or failure of NoC-based systems. • Routing is, of course, directly dictated by placement. • So how can we achieve the best placement in a NoC system? • First we have to distinguish between the different types of NoC architectures…

Tile-based 2D Mesh topology

Regular 2D Mesh topology

Partially irregular 2D Mesh topology • Contains oversized, rectangular PEs.

Irregular Mesh topology • This kind of chip limits neither the shape of the PEs nor the placement of the routers. It may be considered a "custom" NoC.

Torus topology

Fat-Tree topology

NoC Routing Table • Unlike regular SoC placement, NoC placement is not complete without a Routing Table. • The Routing Table determines, for each PE, the route along which it sends packets to each other PE. • The Routing Table directly influences traffic in the NoC. • Here, too, we can distinguish between two methods: • Static routing • Dynamic (adaptive) routing

Static routing • The Routing Table is constant. • The route is embedded in the packet header, and the routers simply forward the packet in the direction indicated by the header. • The routers play a passive role in addressing packets (simple routers).
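Header-embedded static routing can be sketched as follows. The sketch assumes a 2D mesh with (x, y) tile coordinates and builds the route with dimension-ordered (XY) routing; the slides do not say how the static route is chosen, so XY routing and the names `xy_route` and `forward` are assumptions made for illustration.

```python
from typing import List, Tuple

Coord = Tuple[int, int]
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def xy_route(src: Coord, dst: Coord) -> List[str]:
    """Build the packet header: travel along X first, then along Y."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    x_part = ["E"] * dx if dx > 0 else ["W"] * (-dx)
    y_part = ["N"] * dy if dy > 0 else ["S"] * (-dy)
    return x_part + y_part


def forward(src: Coord, header: List[str]) -> List[Coord]:
    """Simple routers: each hop reads the next direction and forwards."""
    x, y = src
    visited = [(x, y)]
    for direction in header:
        sx, sy = STEP[direction]
        x, y = x + sx, y + sy
        visited.append((x, y))
    return visited


header = xy_route((0, 0), (2, 1))
path = forward((0, 0), header)
```

The routers make no decisions of their own here: the whole route is fixed at the source, which is what keeps them simple.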

Dynamic Routing • The Routing Table can change dynamically during operation. • Logically, a route is changed when it becomes slow due to other traffic. • Packets may arrive out of order. • Usually requires more virtual channels. • In this method we can identify two systems: • Route-altering decisions are made in the routers (smart routers). • Route-altering decisions are made in a dedicated central unit that receives traffic information from all the routers and can decide to change the Routing Table.
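The "smart router" variant can be sketched as a local decision: among the output directions that still bring the packet closer to its destination, pick the one that is currently least congested. This is a minimal sketch assuming a 2D mesh and using output queue length as the congestion measure; both choices, and the function names, are assumptions. Deadlock avoidance is not handled here.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]


def minimal_dirs(cur: Coord, dst: Coord) -> List[str]:
    """Directions that reduce the remaining distance on a 2D mesh."""
    dirs = []
    if dst[0] > cur[0]:
        dirs.append("E")
    elif dst[0] < cur[0]:
        dirs.append("W")
    if dst[1] > cur[1]:
        dirs.append("N")
    elif dst[1] < cur[1]:
        dirs.append("S")
    return dirs


def adaptive_next_hop(cur: Coord, dst: Coord,
                      queue_len: Dict[str, int]) -> Optional[str]:
    """Pick the least congested productive output; None means 'arrived'."""
    candidates = minimal_dirs(cur, dst)
    if not candidates:
        return None
    return min(candidates, key=lambda d: queue_len.get(d, 0))


choice_busy_east = adaptive_next_hop((0, 0), (2, 2), {"E": 5, "N": 1})
choice_busy_north = adaptive_next_hop((0, 0), (2, 2), {"E": 0, "N": 3})
```

Because two packets of the same message may take different routes, the receiver has to tolerate out-of-order arrival.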

NoC placement & Routing Algorithms • The algorithms can operate on both the PE placement and the Routing Tables, given the NoC's architecture and routing system.

NoC placement & Routing Algorithms • In every mapping and placement algorithm we have to define a cost function by which to determine whether the algorithm is successful. • Usually the cost function measures performance, energy, or temperature, or a hybrid of these. • Most of the chip power is consumed in the communication links and the routers, which are constantly active. • Are NoC algorithms really different from SoC algorithms? Let's review some of them…
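One simple cost function of this kind can be sketched as follows. It assumes a 2D mesh with minimal routing and models communication cost as traffic volume times hop count, a rough proxy for link and router energy. The model, the traffic numbers and the names are illustrative assumptions, not taken from the slides.

```python
from typing import Dict, Tuple

Tile = Tuple[int, int]
Mapping = Dict[str, Tile]                # PE name -> mesh tile
Traffic = Dict[Tuple[str, str], float]   # (source PE, destination PE) -> volume


def comm_cost(mapping: Mapping, traffic: Traffic) -> float:
    """Total volume-weighted hop count over all communicating PE pairs."""
    total = 0.0
    for (src, dst), volume in traffic.items():
        (sx, sy), (dx, dy) = mapping[src], mapping[dst]
        total += volume * (abs(sx - dx) + abs(sy - dy))
    return total


traffic = {("cpu", "mem"): 100.0, ("cpu", "dsp"): 10.0}

# Two candidate placements of the same PEs on a 2x2 mesh.
far = {"cpu": (0, 0), "mem": (1, 1), "dsp": (0, 1)}
near = {"cpu": (0, 0), "mem": (0, 1), "dsp": (1, 1)}
cost_far, cost_near = comm_cost(far, traffic), comm_cost(near, traffic)
```

Placing the heavily communicating pair next to each other lowers the cost, which is the behavior a mapping algorithm exploits.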

Branch-and-Bound Algorithms • Such algorithms walk through the search tree that represents the solution space. • Finding the optimal solution is equivalent to finding the legal leaf node with the minimal cost.

Genetic Algorithms

Split-Traffic Algorithms • Bandwidth requirements can be significantly reduced by splitting the traffic between cores across multiple paths. • The different routes between source and destination must all be minimal. • Two packets traveling from source to destination along different routes might "collide". The destination PE must know how to deal with this, or the system has to make sure it does not happen.
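The effect of splitting on per-link bandwidth can be sketched as follows. The sketch assumes a 2D mesh, enumerates all minimal paths between two tiles, and divides the flow equally among them; the equal split and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Tile = Tuple[int, int]
Link = Tuple[Tile, Tile]


def minimal_paths(src: Tile, dst: Tile) -> List[List[Tile]]:
    """All shortest paths from src to dst on a mesh, as lists of tiles."""
    if src == dst:
        return [[src]]
    x, y = src
    paths = []
    if x != dst[0]:  # take one step along X towards the destination
        nxt = (x + (1 if dst[0] > x else -1), y)
        paths += [[src] + p for p in minimal_paths(nxt, dst)]
    if y != dst[1]:  # take one step along Y towards the destination
        nxt = (x, y + (1 if dst[1] > y else -1))
        paths += [[src] + p for p in minimal_paths(nxt, dst)]
    return paths


def link_loads(src: Tile, dst: Tile, bandwidth: float,
               split: bool = True) -> Dict[Link, float]:
    """Bandwidth placed on each link, with or without traffic splitting."""
    paths = minimal_paths(src, dst)
    if not split:
        paths = paths[:1]  # single fixed route
    share = bandwidth / len(paths)
    load: Dict[Link, float] = defaultdict(float)
    for path in paths:
        for a, b in zip(path, path[1:]):
            load[(a, b)] += share
    return dict(load)


single = link_loads((0, 0), (1, 1), 100.0, split=False)
shared = link_loads((0, 0), (1, 1), 100.0, split=True)
```

In this toy case the peak link load halves when the flow is split over the two minimal routes.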

Split-Traffic Algorithms cont'd

Summary • The main difference between SoC and NoC is the sharing of links. • The essence of the algorithms, however, is the same for SoC and NoC. In both cases we are trying to find an ideal placement (and routing) solution that makes the design low in power and temperature and high in performance. • The same types of algorithms work in both cases. • One noticeable difference is that NoC routing offers many more design options (fixed or adaptive routing, split traffic) because of the router network.

Summary cont'd • The same P&R algorithm can be applied to various NoC topologies (mesh, fat-tree, torus). • Some topologies are better for certain designs than others. • Most of the time, when one topology is better in performance, it is worse in power consumption.

Further study • Power-state adaptation of PEs and routers. • A popular power-saving method is powering down PEs on the chip when they are not needed. • If a PE is shut down and other PEs keep sending it packets, the resulting congestion will ultimately cause a deadlock. • A design solution has to be found for this issue. Also, if a PE is shut down, can its router be shut down too? If so, the Routing Table needs to adapt accordingly.

Further study cont'd • High-Index routers • In regular 2D mesh NoCs, a router usually has 4 directions in which to forward an incoming packet, using 2 bits for addressing. • Adding one more address bit enables the router to forward incoming packets in up to 8 directions. • This increases the size and complexity of the router's crossbar, but reduces the number of routers in the NoC by a factor of 4. • This could also greatly reduce power, because PEs that share the same router do not have to use the links between routers when communicating among themselves.
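The arithmetic behind these bullets can be checked with a short sketch. It assumes that the four extra directions of a 3-bit router attach four local PEs, so each high-index router serves four PEs instead of one; that reading, the 64-PE example and the function names are assumptions.

```python
import math


def directions(address_bits: int) -> int:
    """Number of output directions addressable with the given bits."""
    return 2 ** address_bits


def routers_needed(num_pes: int, pes_per_router: int) -> int:
    """Routers required when each router serves a fixed number of PEs."""
    return math.ceil(num_pes / pes_per_router)


regular = routers_needed(64, pes_per_router=1)     # one PE per router
high_index = routers_needed(64, pes_per_router=4)  # four PEs per router
reduction = regular // high_index
```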

High-Index router NoC

Backup

Torus topology • The main problem with the mesh topology is its large diameter, which has a negative effect on communication latency. • The torus topology was proposed to reduce the latency of the mesh while keeping its simplicity. • The only difference between the torus and mesh topologies is that the switches on the edges are connected to the switches on the opposite edges through wrap-around channels.
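The diameter reduction can be illustrated with a short sketch that compares hop counts with and without wrap-around channels. The 4x4 size and the function names are assumptions chosen for the example.

```python
from typing import Callable, Tuple

Tile = Tuple[int, int]


def mesh_hops(a: Tile, b: Tile) -> int:
    """Minimal hop count on a mesh (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def torus_hops(a: Tile, b: Tile, width: int, height: int) -> int:
    """Minimal hop count when each row and column wraps around."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return min(dx, width - dx) + min(dy, height - dy)


def diameter(hops: Callable[[Tile, Tile], int], width: int, height: int) -> int:
    """Largest minimal hop count between any two tiles."""
    tiles = [(x, y) for x in range(width) for y in range(height)]
    return max(hops(a, b) for a in tiles for b in tiles)


mesh_d = diameter(mesh_hops, 4, 4)
torus_d = diameter(lambda a, b: torus_hops(a, b, 4, 4), 4, 4)
```

On a 4x4 network, corner-to-corner traffic needs 6 hops on the mesh but only 2 on the torus, and the overall diameter drops from 6 to 4.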

Fat-Tree topology • The Fat-Tree is an indirect interconnection network based on a complete binary tree. • The bandwidth of the Fat-Tree increases closer to the root. • The Fat-Tree architecture is suitable for an on-chip network switching core. • Tree-based topologies are useful for exploiting locality of traffic.

Branch-and-Bound Algorithms • Branch: An unexpanded node is selected, and its next unmapped IP is assigned in turn to each of the remaining unoccupied tiles to generate the corresponding new child nodes. New Routing Tables are generated. • Bound: Each newly generated child node is inspected to see whether it can still lead to the best leaf node. A node can be trimmed away without further expansion if either its cost or its Lower Bound Cost (LBC) is higher than the lowest Upper Bound Cost (UBC) found so far.
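The branch and bound steps can be sketched for a small mapping problem as follows. This is a simplified sketch, not the algorithm from the surveyed work: it assumes a 2D mesh, a volume-times-hops cost, a lower bound that charges one hop to every flow that is not yet fully mapped, and it uses the best complete mapping found so far as the upper bound. Routing-table generation is omitted, and all names and traffic numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Tile = Tuple[int, int]
Traffic = Dict[Tuple[str, str], float]


def hops(a: Tile, b: Tile) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def branch_and_bound(ips: List[str], tiles: List[Tile],
                     traffic: Traffic) -> Tuple[Optional[Dict[str, Tile]], float]:
    best = {"cost": float("inf"), "mapping": None}

    def lower_bound(mapping: Dict[str, Tile]) -> float:
        lb = 0.0
        for (src, dst), volume in traffic.items():
            if src in mapping and dst in mapping:
                lb += volume * hops(mapping[src], mapping[dst])
            else:
                lb += volume  # distinct tiles are at least one hop apart
        return lb

    def expand(index: int, mapping: Dict[str, Tile], free: List[Tile]) -> None:
        bound = lower_bound(mapping)
        if bound >= best["cost"]:
            return  # bound: this subtree cannot beat the best leaf found
        if index == len(ips):
            # Leaf: every IP is mapped, so the bound is the exact cost.
            best["cost"], best["mapping"] = bound, dict(mapping)
            return
        ip = ips[index]
        for tile in free:  # branch: try the next IP on every free tile
            mapping[ip] = tile
            expand(index + 1, mapping, [t for t in free if t != tile])
            del mapping[ip]

    expand(0, {}, list(tiles))
    return best["mapping"], best["cost"]


tiles = [(x, y) for x in range(2) for y in range(2)]
traffic = {("cpu", "mem"): 100.0, ("cpu", "dsp"): 50.0,
           ("dsp", "io"): 10.0, ("mem", "io"): 10.0}
mapping, cost = branch_and_bound(["cpu", "mem", "dsp", "io"], tiles, traffic)
```

In this example every flow can be served in one hop, so the optimum equals the initial lower bound and large parts of the tree are pruned.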

Genetic Algorithms • A computational analogy of biological adaptive systems. • Iterative by design. • Generate an initial, random pool of possible solutions (chromosomes), which are evaluated in each iteration (generation) by a fitness function. • The fitness function drives the pool towards an optimized solution to the problem.
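The loop described above can be sketched for the mapping problem as follows. The encoding (a chromosome is a permutation of tile indices), the one-point order-preserving crossover, the swap mutation, the keep-the-better-half selection and every parameter value are assumptions chosen for illustration.

```python
import random
from typing import Callable, Dict, List, Tuple

Tile = Tuple[int, int]
Mapping = Dict[str, Tile]


def genetic_mapping(ips: List[str], tiles: List[Tile],
                    cost: Callable[[Mapping], float], pop_size: int = 20,
                    generations: int = 40, mutation_rate: float = 0.2,
                    seed: int = 0) -> Tuple[Mapping, float]:
    rng = random.Random(seed)

    def decode(chrom: List[int]) -> Mapping:
        return {ip: tiles[t] for ip, t in zip(ips, chrom)}

    def fitness(chrom: List[int]) -> float:
        return -cost(decode(chrom))  # lower cost means higher fitness

    def crossover(p1: List[int], p2: List[int]) -> List[int]:
        cut = rng.randrange(1, len(p1))
        head = p1[:cut]
        return head + [g for g in p2 if g not in head]  # stays a permutation

    # Initial random pool of chromosomes.
    population = [rng.sample(range(len(tiles)), len(tiles))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]  # the fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(*rng.sample(parents, 2))
            if rng.random() < mutation_rate:  # occasional swap mutation
                i, j = rng.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return decode(best), cost(decode(best))


traffic = {("cpu", "mem"): 100.0, ("cpu", "dsp"): 50.0,
           ("dsp", "io"): 10.0, ("mem", "io"): 10.0}


def comm_cost(mapping: Mapping) -> float:
    return sum(v * (abs(mapping[s][0] - mapping[d][0]) +
                    abs(mapping[s][1] - mapping[d][1]))
               for (s, d), v in traffic.items())


tiles = [(x, y) for x in range(2) for y in range(2)]
mapping, cost = genetic_mapping(["cpu", "mem", "dsp", "io"], tiles, comm_cost)
```

Unlike branch-and-bound, this search gives no optimality guarantee; it trades that guarantee for scalability to larger designs.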
