Flattened Butterfly Topology for On-Chip Networks

John Kim, James Balfour, and William J. Dally
Presented by Jun Pang

Motivation & Goal
• Most on-chip networks (2D mesh) are low-radix
  • Pros: simple, short wires
  • Cons: large network diameter and energy inefficiency (many hops)
• High-radix networks
  • Far fewer intermediate routers
  • Lower latency and lower power
• Goal: how can an on-chip network use high-radix routers to reduce latency and energy?

On-chip network
• Bandwidth is plentiful because wires are inexpensive, while buffers are expensive
• Lower cost
  • From shorter distances
  • By reducing the number of channels and buffers
• Concentration: several terminal nodes share resources (routers)
• Latency
  • Reduce hop count at the expense of a higher serialization latency (Ts) to get a lower overall latency
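
The latency trade-off on this slide can be made concrete with a small sketch. It assumes the usual zero-load model, in which packet latency is the header latency (hop count times per-router delay) plus the serialization latency (packet size divided by channel width), and it reads "TS" as that serialization term. The function name and all numbers below are hypothetical, chosen only to show the direction of the trade-off, not taken from the paper.

```python
def zero_load_latency(hops, router_delay, packet_bits, channel_bits):
    """Zero-load latency in cycles: header latency plus serialization latency."""
    header = hops * router_delay                      # T_h = H * t_r
    serialization = -(-packet_bits // channel_bits)   # T_s = ceil(L / b)
    return header + serialization

# Hypothetical mesh: many hops, wide channels.
mesh = zero_load_latency(hops=6, router_delay=3, packet_bits=512, channel_bits=256)

# Hypothetical high-radix network: few hops, narrower channels (larger T_s).
fbfly = zero_load_latency(hops=2, router_delay=3, packet_bits=512, channel_bits=128)

print(mesh, fbfly)
```

With these made-up parameters the serialization term doubles for the high-radix case, yet the total still drops because the header term shrinks by more.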

On-chip Flattened Butterfly (Fig. 3a)
• Topology
  • Radix = 10 (concentration factor: 4; 3 links in dimension 1; 3 links in dimension 2)
  • At most 2 hops between routers
  • Longer wires -> deeper buffers
• Non-minimal global adaptive routing (UGAL)
  • Load balance and performance through path diversity
  • Routes minimally or non-minimally
  • Non-minimal routes use minimal dimension-ordered routing for each leg (prevents deadlock)
  • Only 2 VCs
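
The radix and hop-count figures above can be checked with a short sketch. It assumes a 4x4 grid of routers in which each router links directly to every other router in its row and in its column (3 links per dimension implies 4 routers per dimension), with 4 terminals per router. The helper names are illustrative and do not come from the paper.

```python
from itertools import product

K = 4  # routers per dimension (assumed from "3 links per dimension")
C = 4  # concentration factor: terminals attached to each router

def neighbors(router):
    """Routers directly linked to `router`: all others in its row and column."""
    x, y = router
    row = [(i, y) for i in range(K) if i != x]
    col = [(x, j) for j in range(K) if j != y]
    return row + col

def radix(router):
    """Router ports: terminal ports plus inter-router links."""
    return C + len(neighbors(router))

def minimal_hops(src, dst):
    """Inter-router hops on a minimal route: one per differing coordinate."""
    return sum(a != b for a, b in zip(src, dst))

def dimension_order_route(src, dst):
    """Minimal route that corrects the x coordinate first, then y."""
    path = [src]
    if src[0] != dst[0]:
        path.append((dst[0], src[1]))
    if src[1] != dst[1]:
        path.append(dst)
    return path

routers = list(product(range(K), repeat=2))
max_hops = max(minimal_hops(a, b) for a in routers for b in routers)
print(radix((0, 0)), max_hops, dimension_order_route((0, 0), (3, 2)))
```

Under these assumptions every router has 4 + 3 + 3 = 10 ports, and any two routers are at most two hops apart because one hop suffices to fix each coordinate.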

Bypass Channels & Microarchitecture
• Goal: reduce the distance packets travel, to cut latency and energy
• Two types of muxes
  • Input muxes: select between bypass inputs and direct inputs
  • Output muxes: select between direct outputs and bypass inputs
• Yield arbiter guarantees global fairness
  • If the primary input is idle, the non-primary input is chosen
  • Control packets prevent starvation
• Combines minimal and non-minimal routing
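
The yield arbitration described above can be sketched as a two-input mux. This is a minimal illustration, not the paper's microarchitecture: the slides do not say how a control packet triggers a yield, so the `yield_requested` flag is a hypothetical stand-in for that mechanism, and the class and attribute names are invented for this example.

```python
from collections import deque

class YieldMux:
    """Two-input mux: the primary input wins unless it is idle or asked to yield."""

    def __init__(self):
        self.primary = deque()        # flits waiting on the primary input
        self.non_primary = deque()    # flits waiting on the non-primary input
        self.yield_requested = False  # hypothetical: set when a control packet arrives

    def select(self):
        """Return the next flit to forward, or None if both inputs are idle."""
        must_yield = self.yield_requested and bool(self.non_primary)
        if self.primary and not must_yield:
            return self.primary.popleft()
        if self.non_primary:
            self.yield_requested = False  # a request is consumed by one grant
            return self.non_primary.popleft()
        return None

mux = YieldMux()
mux.primary.extend(["p0", "p1"])
mux.non_primary.append("n0")

first = mux.select()        # primary has priority
mux.yield_requested = True  # stand-in for a control packet asking for a turn
second = mux.select()       # primary yields once
third = mux.select()        # primary resumes
fourth = mux.select()       # both inputs idle
print(first, second, third, fourth)
```

Without the yield request, a busy primary input would keep the non-primary flit waiting indefinitely, which is the starvation case the control packet is meant to prevent.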

Bypass Channels (continued)
• Switch architecture
  • Minimal routing: simplified crossbar switch
  • Non-minimal routing: more complex switch
  • Non-minimal routing with bypass channels: less complex switch
• Flow control & routing
  • Buffers for non-primary inputs
  • Separate buffers for the destinations of control packets
  • UGAL is modified to support bypass channels
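
For reference, the baseline UGAL decision that this slide modifies can be sketched as follows. It assumes the standard rule of comparing estimated delays, taken as queue depth times hop count, for the minimal and the non-minimal candidate. The bypass-channel modification is not detailed in the slides and is not modeled here, and the function and parameter names are illustrative.

```python
def ugal_choose(q_min, hops_min, q_nonmin, hops_nonmin):
    """Pick a route by comparing estimated delay (queue depth x hop count)."""
    if q_min * hops_min <= q_nonmin * hops_nonmin:
        return "minimal"
    return "non-minimal"

# Lightly loaded minimal path: estimated 2*2 = 4 vs 1*4 = 4, so stay minimal.
light = ugal_choose(q_min=2, hops_min=2, q_nonmin=1, hops_nonmin=4)

# Congested minimal path: estimated 8*2 = 16 vs 1*4 = 4, so detour.
congested = ugal_choose(q_min=8, hops_min=2, q_nonmin=1, hops_nonmin=4)

print(light, congested)
```

The tie goes to the minimal route, so traffic only takes the longer path when the local queue estimate says the detour is strictly cheaper.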

Evaluation
• Throughput: up to 50% higher than the concentrated mesh
• Power: about 38% lower than the mesh
• Latency: about 28% lower than the mesh

Scalability
• Channel count grows by a smaller factor than in a hypercube
• Three ways to scale
  • Increase the concentration factor
  • Increase the dimension of the flattened butterfly
  • Hybrid approach
• Future technology helps with long wires
• Increasing the number of VCs slightly reduces latency

Conclusion & Concerns
• Flattened butterfly: an interesting idea
  • Maximum distance between nodes = 2
  • Non-minimal routing to balance load
  • Bypass channels to reduce latency
  • Lower latency and power, higher throughput than the mesh
• Concerns
  • High channel count? (higher than mesh and torus)
  • Low channel utilization? (due to the high channel count)
  • Control complexity? (arbitration, control packets)
  • Are bypass channels a good idea? (Why not use only minimal or only non-minimal routing?)
