ONL Freelist Manager

David M. Zar, Applied Research Laboratory, Computer Science and Engineering Department


Presentation Transcript


  1. ONL Freelist Manager
     David M. Zar
     Applied Research Laboratory
     Computer Science and Engineering Department

  2. ONL NP Router [block diagram]
     • Microengine blocks: Rx (2 ME), Mux (1 ME), Parse, Lookup, Copy (3 MEs), QM (1 ME), HdrFmt (1 ME), Tx (1 ME), Stats (1 ME), FreeList Mgr (1 ME)
     • Plugins: Plugin0, Plugin1, Plugin2, Plugin3, Plugin4
     • Other elements: XScale, TCAM, Assoc. Data ZBT-SRAM, SRAM
     • Interconnect labels: NN rings, SRAM rings, Scratch rings; memory labels of 64KW each

  3. MEs -> FM (Freelist Manager)
     • Message format (one 32-bit word): Rsv (8b) | Buffer Handle (24b)
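
The one-word message above can be illustrated with a small packing helper. This is only a sketch: the slide gives the field widths (8 reserved bits, 24 handle bits) but not the bit positions, so placing Rsv in the high byte is an assumption, and the function names are hypothetical.

```python
RSV_BITS = 8        # width from the slide
HANDLE_BITS = 24    # width from the slide
HANDLE_MASK = (1 << HANDLE_BITS) - 1
RSV_MASK = (1 << RSV_BITS) - 1


def pack_fm_word(buffer_handle, rsv=0):
    """Build the 32-bit word an ME would send to the FM.

    Assumption: Rsv occupies bits 31..24 and the handle bits 23..0.
    """
    if not 0 <= buffer_handle <= HANDLE_MASK:
        raise ValueError("buffer handle does not fit in 24 bits")
    if not 0 <= rsv <= RSV_MASK:
        raise ValueError("reserved field does not fit in 8 bits")
    return (rsv << HANDLE_BITS) | buffer_handle


def unpack_fm_word(word):
    """Split a 32-bit FM word into (rsv, buffer_handle)."""
    return (word >> HANDLE_BITS) & RSV_MASK, word & HANDLE_MASK
```

Masking on the receive side means the FM would ignore whatever a sender leaves in the reserved byte.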

  4. ONL DRAM Buffer and SRAM Buffer Descriptor
     SRAM buffer descriptor layout (one 32-bit word per line):
       Buffer_Next (32b)
       Buffer_Size (16b) | Offset (16b)
       Packet_Size (16b) | Free_list 0000 (4b) | Reserved (4b) | Ref_Cnt (8b)
       MAC DAddr_47_32 (16b) | Stats Index (16b)
       MAC DAddr_31_00 (32b)
       EtherType (16b) | Reserved (16b)
       Reserved (32b)
       Packet_Next (32b)
     DRAM buffer layout:
       0x000  Empty
       0x180  Ethernet Hdr
       0x18E  IP Packet
       0x800  (end)
     • Normal Unicast case:
       • One copy of packet being sent to one output port
     • SRAM Buffer Descriptor Fields:
       • Buffer_Next: NULL
       • Buffer_Size: IP_Pkt_Length
       • Packet_Size: IP_Pkt_Length
       • Offset: 0x18E
       • Freelist: 0
       • Ref_Cnt: 1
       • MAC_DAddr: <result of lookup>
       • Stats Index: <from lookup result>
       • EtherType: 0x0800 (IP)
       • Packet_Next: <as used by QM>
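
To make the descriptor layout concrete, here is a minimal sketch that packs the unicast field values into eight 32-bit words. The field widths and the unicast values come from the slide; the big-endian byte order, the most-significant-first placement of fields within a word, the use of 0 for NULL, and all Python names are assumptions made for illustration.

```python
import struct
from dataclasses import dataclass

NULL = 0  # assumption: the slide does not give the numeric NULL value


@dataclass
class BufDesc:
    buffer_next: int = NULL
    buffer_size: int = 0
    offset: int = 0
    packet_size: int = 0
    free_list: int = 0
    ref_cnt: int = 0
    mac_daddr: int = 0      # full 48-bit destination MAC
    stats_index: int = 0
    ether_type: int = 0
    packet_next: int = NULL

    def pack(self) -> bytes:
        """Serialize to 8 words (32 bytes); reserved bits are left as zero."""
        words = [
            self.buffer_next & 0xFFFFFFFF,
            ((self.buffer_size & 0xFFFF) << 16) | (self.offset & 0xFFFF),
            ((self.packet_size & 0xFFFF) << 16)
            | ((self.free_list & 0xF) << 12)
            | (self.ref_cnt & 0xFF),
            (((self.mac_daddr >> 32) & 0xFFFF) << 16) | (self.stats_index & 0xFFFF),
            self.mac_daddr & 0xFFFFFFFF,
            (self.ether_type & 0xFFFF) << 16,
            0,  # Reserved (32b)
            self.packet_next & 0xFFFFFFFF,
        ]
        return struct.pack(">8I", *words)


def unicast_descriptor(ip_pkt_length, mac_daddr, stats_index):
    """Descriptor for the normal unicast case, using the slide's values."""
    return BufDesc(
        buffer_next=NULL,
        buffer_size=ip_pkt_length,
        packet_size=ip_pkt_length,
        offset=0x18E,
        free_list=0,
        ref_cnt=1,
        mac_daddr=mac_daddr,
        stats_index=stats_index,
        ether_type=0x0800,
    )
```

Packet_Next is left at NULL here because the slide only says it is used by the QM.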

  5. ONL DRAM Buffer and SRAM Buffer Descriptor
     • Multi-copy case:
       • >1 copy of packet in system
       • This copy going from Copy to QM to go out on an output port
     • Two SRAM buffer descriptors are shown: a Header Buf Descriptor and a Payload Buf Descriptor
     • Each descriptor has the fields Buffer_Next (32b), Buffer_Size (16b), Offset (16b), Packet_Size (16b), Free_list 0000 (4b), Reserved (4b), Ref_Cnt (8b), MAC DAddr_47_32 (16b), Stats Index (16b), MAC DAddr_31_00 (32b), EtherType (16b), Reserved (16b), Reserved (32b), Packet_Next (32b)
     DRAM buffers (header | payload):
       0x000  Empty | Empty
       0x180  Empty | Ethernet Hdr
       0x18E  Empty | IP Packet
       0x800  (end)

  6. ONL DRAM Buffer and SRAM Buffer Descriptor
     (Header and Payload Buf Descriptors and DRAM buffers as on slide 5)
     • Multi-copy case (continued):
       • >1 copy of packet in system
       • This copy going from Copy to QM to go out on an output port
     • Header Buf Descriptor:
       • SRAM Buffer Descriptor Fields:
         • Buffer_Next: ptr to payload buf desc
         • Buffer_Size: 0 (Don’t Care)
         • Packet_Size: IP_Pkt_Length
         • Offset: 0 (Don’t Care)
         • Freelist: 0
         • Ref_Cnt: 1
         • MAC_DAddr: <result of lookup>
         • Stats Index: <from lookup result>
           • Different copies of the same packet may actually have different Stats Indices
         • EtherType: 0x0800 (IP)
         • Packet_Next: <as used by QM>

  7. ONL DRAM Buffer and SRAM Buffer Descriptor
     (Header and Payload Buf Descriptors and DRAM buffers as on slide 5)
     • Multi-copy case (continued):
       • >1 copy of packet in system
       • This copy going from Copy to QM to go out on an output port
     • Payload Buf Descriptor:
       • SRAM Buffer Descriptor Fields:
         • Buffer_Next: NULL
         • Buffer_Size: IP_Pkt_Length
         • Packet_Size: IP_Pkt_Length
         • Offset: 0x18E
         • Freelist: 0
         • Ref_Cnt: <number of copies currently in system>
         • MAC_DAddr: <don’t care>
         • Stats Index: <should not be used>
         • EtherType: <don’t care>
         • Packet_Next: <should not be used>
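
The header/payload relationship in the multi-copy case can be sketched as a small in-memory model: one payload descriptor whose Ref_Cnt equals the number of copies, and one header descriptor per copy pointing at it. The class and function names are hypothetical, only the fields relevant to the relationship are modelled, and object references stand in for SRAM pointers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PAYLOAD_OFFSET = 0x18E  # offset of the IP packet in the DRAM buffer (from the slides)


@dataclass
class Desc:
    buffer_next: Optional["Desc"] = None  # None models NULL
    buffer_size: int = 0
    offset: int = 0
    packet_size: int = 0
    ref_cnt: int = 0
    stats_index: Optional[int] = None     # None models "should not be used"


def make_multicopy(ip_pkt_length: int, stats_indices: List[int]) -> Tuple[Desc, List[Desc]]:
    """Build one payload descriptor and one header descriptor per copy."""
    payload = Desc(
        buffer_next=None,
        buffer_size=ip_pkt_length,
        offset=PAYLOAD_OFFSET,
        packet_size=ip_pkt_length,
        ref_cnt=len(stats_indices),       # number of copies currently in system
    )
    headers = [
        Desc(
            buffer_next=payload,          # ptr to payload buf desc
            buffer_size=0,                # don't care
            offset=0,                     # don't care
            packet_size=ip_pkt_length,
            ref_cnt=1,
            stats_index=idx,              # copies may use different indices
        )
        for idx in stats_indices
    ]
    return payload, headers
```

For example, `make_multicopy(100, [3, 4, 5])` models three copies sharing one payload buffer.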

  8. FM()
       while (true) {
         dl_source_scr_1word()          // read one BufHandle from the scratch ring
         if (BufHandle->NextBuffer == UC_NULL) {
           if (BufHandle->RefCnt-- == 1) {
             WU_dl_buf_free(BufHandle)
           } else {
             // do nothing else… this is the standard case for TX transmitting
             // all but the last copy of a copied packet
           }
         } else {
           DataBuffer = BufHandle->NextBuffer
           if (BufHandle->RefCnt != 1) {
             ERROR   // RefCnt != 1 but this is pointing to a data buffer
           } else {
             BufHandle->NextBuffer = UC_NULL
             WU_dl_buf_free(BufHandle)
             if (DataBuffer->RefCnt-- == 1)
               WU_dl_buf_free(DataBuffer)
           }
         }
       }
     • WU_dl_buf_free does the actual cleanup and enqueuing of the SRAM buffer
     • dl_buf_free will be modified to send commands to FM (???)
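
The control flow of the FM loop can be exercised with a small host-side simulation. This is a sketch under stated assumptions, not the microengine code: a `deque` stands in for the scratch ring, a dict of dicts for SRAM descriptors, appending to a list for `WU_dl_buf_free`, and `UC_NULL = 0` is an assumed value.

```python
from collections import deque

UC_NULL = 0  # assumption: the slides do not give the numeric value


def run_fm(ring, sram, free_list):
    """Drain the ring, freeing buffers as the slide's pseudocode describes.

    ring:      deque of buffer handles sent by the MEs
    sram:      dict mapping handle -> {"next": handle, "ref_cnt": int}
    free_list: list that receives freed handles (models WU_dl_buf_free)
    """
    while ring:
        handle = ring.popleft()
        desc = sram[handle]
        if desc["next"] == UC_NULL:
            # Single buffer: free it only when the last reference goes away.
            desc["ref_cnt"] -= 1
            if desc["ref_cnt"] == 0:
                free_list.append(handle)
        else:
            # Header descriptor chained to a shared payload buffer.
            data = desc["next"]
            if desc["ref_cnt"] != 1:
                raise RuntimeError("RefCnt != 1 on a descriptor pointing to a data buffer")
            desc["next"] = UC_NULL
            free_list.append(handle)
            sram[data]["ref_cnt"] -= 1
            if sram[data]["ref_cnt"] == 0:
                free_list.append(data)
```

With two header descriptors sharing one payload whose reference count is 2, the payload is freed only after the second header has been processed.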

  9. Performance Targets
     • To hit a 5 Gb/s rate:
       • 76B per minimum IPv4 packet (64B min Enet frame + 12B IFS)
       • 1.4 GHz clock rate
       • 5 Gb/sec * 1B/8b * packet/76B = 8.22 Mp/sec
       • 1.4 Gcycle/sec * 1 sec/8.22 Mp = 170 cycles per packet
     • Compute budget: 170 cycles
     • Latency budget: (threads * 170)
       • 8 threads: 1360 cycles
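
The budget arithmetic above can be reproduced in a few lines. The default values are the ones on the slide; the function name and parameter names are illustrative.

```python
def packet_budget(line_rate_bps=5e9, min_pkt_bytes=76, clock_hz=1.4e9, threads=8):
    """Return (packets/sec, cycles per packet, latency budget in cycles)."""
    pkts_per_sec = line_rate_bps / 8 / min_pkt_bytes   # bits -> bytes -> packets
    cycles_per_pkt = clock_hz / pkts_per_sec           # compute budget
    latency_budget = threads * cycles_per_pkt          # all threads overlap
    return pkts_per_sec, cycles_per_pkt, latency_budget
```

Unrounded, the per-packet budget comes out slightly above 170 cycles; the slide rounds down to 170 before multiplying by the thread count, which gives its figure of 1360.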

  10. FM Block Diagram (worst case)
      Step | memory access | latency
      • Read Scratch Ring | SCR Read: 1W | 60 cycles
      • Check BufHandle->NextBuffer | SRAM Read: 1W | 150 cycles
      • Set BufNext = UC_NULL | SRAM Write: 1W | 150 cycles
      • ctx_swap, WU_dl_buf_free | SRAM Enqueue |
      • --RefCnt == 0 | SRAM Test-and-decr | 150 cycles
      • ctx_swap, WU_dl_buf_free | SRAM Enqueue |
      • TOTAL (no optimization): 510 cycles
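
As a quick check, the worst-case total can be summed and compared with a latency budget of threads times 170 cycles. The latencies are those listed on the slide; the comparison itself and the names used are an illustrative assumption about how the two figures relate. The slide gives no latency for the SRAM enqueues, so they are left out here just as in the slide's total.

```python
# (step, latency in cycles) for the accesses that have a figure on the slide
WORST_CASE_STEPS = [
    ("Scratch ring read (1W)", 60),
    ("SRAM read (1W): check NextBuffer", 150),
    ("SRAM write (1W): set BufNext = UC_NULL", 150),
    ("SRAM test-and-decrement: RefCnt", 150),
]


def total_latency(steps):
    """Sum the per-access latencies with no overlap (no optimization)."""
    return sum(cycles for _, cycles in steps)


def fits_latency_budget(total_cycles, threads=8, cycles_per_packet=170):
    """True if the per-packet latency fits within threads * cycles_per_packet."""
    return total_cycles <= threads * cycles_per_packet
```

Under these assumptions the unoptimized path needs at least three threads' worth of budget, since 510 cycles is exactly 3 * 170.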

  11. Lookup File locations
      • Code
        • src/applications/ONL_Router/src/freelistMgr/freelistMgr.uc
      • Include Paths
        • src/applications/ONL_Router/src/dispatch_loop/ONL/
          • dl_source.h and dl_source.uc
          • dl_source() and dl_sink() functions
        • Other, standard, include paths (Intel SDK provided)
