
Block Design Review: ONL NP Router Multiplexer (MUX)

Mart Haitjema, mah5@cse.wustl.edu, http://www.arl.wustl.edu/projects/techX/design/design.html. Revision history: 5/1/07 (MAH): Released.


Presentation Transcript


  1. Block Design Review: ONL NP Router Multiplexer (MUX). Mart Haitjema, mah5@cse.wustl.edu, http://www.arl.wustl.edu/projects/techX/design/design.html

  2. Revision History
     • 5/1/07 (MAH): Released

  3. ONL NP Router (block diagram; slide modified from ONL_NProuter.ppt)
     • Microengine blocks: Rx (2 ME), Mux (1 ME), Parse, Lookup, Copy (3 MEs), QM (1 ME), HdrFmt (1 ME), Tx (1 ME), Stats (1 ME), FreeList Mgr (1 ME)
     • Plugin1 through Plugin5 and the xScale
     • Memories: TCAM, Assoc. Data ZBT-SRAM, SRAM
     • Interconnect labels: NN rings, SRAM rings (64KW; 32KW each), scratch rings

  4. Contents
     • Overview
          • MUX Function
          • Handling RX
          • Configurable Multiplexer Policy
     • Design
          • Compute & Latency Budget
          • Design Overview
     • Implementation Status

  5. Overview - Function
     • Multiplex input from:
          • RX → MUX: 2 words per pkt, 64KW SRAM ring, 64KW/2 = 32K pkts
          • xScale → MUX: 3 words per pkt, 64KW SRAM ring, 64KW/3 = 21.3K pkts
          • Plugins → MUX: 3 words per pkt, 64KW SRAM ring, 64KW/3 = 21.3K pkts
     • To Parse-Lookup-Copy:
          • MUX → PLC: 3 words per pkt, 256-word scratch ring, 256/3 = 85 pkts
     (Diagram: RX, xScale, and Plugins each feed Mux (1 ME) through a 64KW ring; Mux feeds PLC.)
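The ring capacities above are integer divisions of ring size by entry size. A minimal sketch of that arithmetic follows; the function name `ring_capacity_pkts` and the `KW` constant are illustrative and do not come from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define KW 1024u /* one "KW" taken as 1024 32-bit words */

/* Number of whole packet entries that fit in a ring of ring_words words
 * when each entry occupies words_per_pkt words (hypothetical helper). */
uint32_t ring_capacity_pkts(uint32_t ring_words, uint32_t words_per_pkt)
{
    assert(words_per_pkt > 0);
    return ring_words / words_per_pkt; /* a partial entry does not count */
}
```

With these definitions, `ring_capacity_pkts(64u * KW, 2)` gives 32768 entries (32K), `ring_capacity_pkts(64u * KW, 3)` gives 21845 (about 21.3K), and `ring_capacity_pkts(256, 3)` gives 85.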

  6. Overview - Handling RX (slide modified from ONL_NProuter.ppt)
     • Modify Header Buffer Descriptor from RX
     • RX → MUX entry (2 words, 64KW ring):
          • Word 1: Buf Handle (32b)
          • Word 2: Eth. Frame Len (16b) | Reserved (12b) | InPort (4b)
     • 3-word entry format shown on the slide:
          • Word 1: Rsv (4b) | Out Port (4b) | Buffer Handle (24b)
          • Word 2: L3 (IP, ARP, …) Pkt Length (16b) | QID (16b)
          • Word 3: Plugin Tag (5b) | In Port (3b) | Flags (8b) | Stats Index (16b)
     • Also shown: a word laid out as Rsv (8b) | Buffer Handle (24b)
     • Flags (8b): Reserved (5b, bits 7-3) | Src (2b, bits 2-1) | PT (1b, bit 0)
          • Src: Source (2b): 00: Rx, 01: XScale, 10: Plugin, 11: Undefined
          • PT (1b): PassThrough (1) / Classify (0)
     (Diagram: Rx (2 ME) → 64KW ring → Mux (1 ME) → Parse, Lookup, Copy (3 MEs); Plugin0, Plugin1, and the xScale also feed the Mux.)
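The flag and RX-entry layouts on this slide can be illustrated with a few bit-manipulation helpers. This is a sketch under assumptions: the bit positions (PT in bit 0, Src in bits 2-1, frame length in the upper 16 bits of RX word 2, InPort in the low 4 bits) are read from the slide's field order, and all names (`mux_make_flags`, `rx_frame_len`, and so on) are hypothetical rather than taken from mux.c.

```c
#include <assert.h>
#include <stdint.h>

/* Src field values as listed on the slide (value 3 is undefined). */
enum mux_src { MUX_SRC_RX = 0, MUX_SRC_XSCALE = 1, MUX_SRC_PLUGIN = 2 };

/* Build the 8-bit Flags field: Reserved(7..3)=0 | Src(2..1) | PT(0).
 * Bit positions are an assumption based on the slide's field order. */
uint8_t mux_make_flags(enum mux_src src, int pass_through)
{
    assert(src <= MUX_SRC_PLUGIN);
    return (uint8_t)((((unsigned)src & 0x3u) << 1) | (pass_through ? 1u : 0u));
}

/* Extract the Src and PT fields back out of a Flags byte. */
unsigned mux_flags_src(uint8_t flags) { return (flags >> 1) & 0x3u; }
unsigned mux_flags_pt(uint8_t flags)  { return flags & 0x1u; }

/* RX word 2, assumed layout: Eth. Frame Len(31..16) | Reserved(15..4) | InPort(3..0). */
uint16_t rx_frame_len(uint32_t word2) { return (uint16_t)(word2 >> 16); }
uint8_t  rx_in_port(uint32_t word2)   { return (uint8_t)(word2 & 0xFu); }
```

For example, a pass-through packet from a plugin would carry `mux_make_flags(MUX_SRC_PLUGIN, 1)`, which is 0x05 under this layout, and an RX word 2 of 0x05DC0003 decodes to a 1500-byte frame on input port 3.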

  7. Overview - Handling RX (slide from ONL_NProuter.ppt)
     • Mux Block writes:
          • Buffer_size = (frame length from Rx) - 14
          • Packet_size = (frame length from Rx) - 14
          • Offset = 0x18E
          • Freelist = 0
          • Ref_cnt = 1
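The descriptor writes listed above could be sketched as follows. The struct layout, field widths, and names (`mux_buf_desc`, `mux_make_desc`) are assumptions for illustration only; the real buffer descriptor layout is not given on the slide. The constant 14 is presumably the Ethernet header length (6 + 6 + 2 bytes), which the slide does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_HDR_LEN    14u    /* assumed meaning of the "-14" on the slide */
#define MUX_PKT_OFFSET 0x18Eu /* Offset value written by the Mux block */

/* Hypothetical in-memory view of the fields the Mux block writes. */
struct mux_buf_desc {
    uint16_t buffer_size;
    uint16_t packet_size;
    uint16_t offset;
    uint8_t  freelist;
    uint8_t  ref_cnt;
};

/* Fill the descriptor fields from the Ethernet frame length reported by Rx. */
struct mux_buf_desc mux_make_desc(uint16_t eth_frame_len)
{
    struct mux_buf_desc d;

    assert(eth_frame_len >= ETH_HDR_LEN);
    d.buffer_size = (uint16_t)(eth_frame_len - ETH_HDR_LEN);
    d.packet_size = (uint16_t)(eth_frame_len - ETH_HDR_LEN);
    d.offset      = MUX_PKT_OFFSET;
    d.freelist    = 0;
    d.ref_cnt     = 1;
    return d;
}
```

For a minimum 64-byte frame, this yields a buffer size and packet size of 50 bytes each.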

  8. Overview - Multiplexer Policy
     • MUX should service input queues based on a configurable policy
     • Round-Robin Policy
          • Queues are serviced in round-robin fashion
          • Each input queue is assigned a quantum, which specifies the number of packets (0 to 255) to be serviced from that queue (if available) before moving on to the next queue
          • A quantum value of 0 means skip the queue unless all other queues are empty
          • Quantum values are stored as 3 contiguous bytes in scratch memory
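The quantum policy above can be sketched as a small selection function. This is one possible reading, not the implementation in mux.c: the names (`mux_sched`, `mux_select`) are hypothetical, the queue order is arbitrary, and zero-quantum queues are served one packet at a time whenever no queue with a nonzero quantum has packets, which is an assumption about the "skip unless all other queues are empty" rule.

```c
#include <assert.h>
#include <stdint.h>

#define MUX_NUM_QUEUES 3 /* RX, xScale, Plugins */

struct mux_sched {
    uint8_t  quantum[MUX_NUM_QUEUES]; /* 3 contiguous bytes, 0..255 each */
    unsigned cur;                     /* queue currently being serviced */
    unsigned served;                  /* packets served from cur this turn */
};

/* Pick the queue to service next given current occupancies.
 * Returns a queue index, or -1 if every queue is empty. */
int mux_select(struct mux_sched *s, const unsigned occ[MUX_NUM_QUEUES])
{
    unsigned i, q;

    assert(s->cur < MUX_NUM_QUEUES);

    /* Stay on the current queue while it has packets and quantum left. */
    if (occ[s->cur] > 0 && s->served < s->quantum[s->cur]) {
        s->served++;
        return (int)s->cur;
    }
    /* Otherwise move to the next non-empty queue with a nonzero quantum.
     * i == MUX_NUM_QUEUES wraps back to cur and restarts its quantum. */
    for (i = 1; i <= MUX_NUM_QUEUES; i++) {
        q = (s->cur + i) % MUX_NUM_QUEUES;
        if (occ[q] > 0 && s->quantum[q] > 0) {
            s->cur = q;
            s->served = 1;
            return (int)q;
        }
    }
    /* Only zero-quantum queues (if any) hold packets: serve one packet. */
    for (i = 1; i <= MUX_NUM_QUEUES; i++) {
        q = (s->cur + i) % MUX_NUM_QUEUES;
        if (occ[q] > 0) {
            s->cur = q;
            s->served = 0;
            return (int)q;
        }
    }
    return -1;
}
```

With quanta {2, 1, 0} and all three queues backlogged, repeated calls select queues 0, 0, 1, 0, 0, 1, and so on; queue 2 is selected only once queues 0 and 1 are both empty.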

  9. Compute & Latency Budget (slide modified from ONL_NProuter.ppt)
     • What is our performance target?
     • To hit a 5 Gb/sec rate:
          • Minimum Ethernet frame: 76B (64B frame + 12B inter-frame spacing)
          • 5 Gb/sec * 1B/8b * 1 packet/76B = 8.22 Mpkt/sec
     • IXP ME processing:
          • 1.4 GHz clock rate
          • 1.4 Gcycle/sec * 1 sec/8.22 Mpkt = 170.3 cycles per packet
     • Compute budget: 1 ME, thus 170 cycles per packet
     • Latency budget: (threads * 170)
          • 1 ME, 1 thread: 170 cycles
          • 1 ME, 4 threads: 680 cycles
          • 1 ME, 8 threads: 1360 cycles
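The budget arithmetic above can be reproduced with integer math. The function names below are illustrative only. Computing in one step gives 170.24 cycles before truncation; the slide's 170.3 comes from rounding the packet rate to 8.22 Mpkt/sec first. Both round down to the 170-cycle budget.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles available per packet on one ME:
 * clock_hz / (rate_bps / (8 * wire_bytes)), truncated to whole cycles. */
uint64_t cycles_per_packet(uint64_t clock_hz, uint64_t rate_bps,
                           uint64_t wire_bytes)
{
    assert(rate_bps > 0);
    return clock_hz * wire_bytes * 8u / rate_bps;
}

/* Latency budget when `threads` threads each handle one packet in turn. */
uint64_t latency_budget(uint64_t cycles_per_pkt, unsigned threads)
{
    return cycles_per_pkt * threads;
}
```

With a 1.4 GHz clock, a 5 Gb/sec target, and 76 bytes per minimum frame, `cycles_per_packet(1400000000, 5000000000, 76)` returns 170, and `latency_budget(170, 8)` returns 1360.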

  10. Design Overview (flow diagram; steps as labeled on the slide)
     • Wait for prev. sig_start
     • Read Quantum Values
     • Read All Occupancy Counters, then swap
     • Select Queue: Service Plugins, Service RX, or Service xScale
     • For the selected source: Read Input Ring (Plugins, xScale, or RX), Write Occupancy Counter, Signal next_start, then swap
     • Format & Write Buffer Descriptor, then swap
     • Update Stats Counter
     • Write PLC Output Ring (dl_sink), then swap
     • Cycle annotations on the diagram: 60, 300, 150, and 60 cycles; Latency Total: ~420

  11. Implementation Status
     • MUX Assembly Stub:
          • Currently reads only from RX
          • Performs most of the functionality for RX
     • Need to implement:
          • Thread ordering
          • Quantum Policy
          • Conditional block to process from Plugins and xScale
          • Read and Write Occupancy Counters

  12. File locations (in …/ONL_Router/)
     • Code
          • src/mux/ONL/mux.c
     • Includes
          • src/dispatch_loop/ONL/dl_source.[h,c]
          • dl_source() and dl_sink() functions
