Advanced Protocol Offloading for Network Processors: Architecture and Implementation
This presentation gives an overview of protocol offloading on the IXP2400 network processor. It covers the architecture (bus, ports, and SRAM/DRAM usage) and the multi-threaded processing elements that handle packet data. It then walks through the theory of operation on both the host and the IXP, following a packet from host send through to host receive. Proposed future additions include interrupt handling and a custom protocol header.
Presentation Transcript
IXP2400 Protocol Offloading • Yan Luo (yluo@cs.ucr.edu) • Chris Baron (cbaron@cs.ucr.edu)
NP Overview (diagram labels) • Network processor: multi-threaded processing elements, co-processor • Input ports, output ports • Bus to SDRAM (packet buffer) • Bus to SRAM (control structures)
IXP OS bypass • Diagram labels: User, Kernel, BigPhys, SendBuf, RcvBuf, PCI, IXP DRAM, Downbuf/Upbuf
Protocol Overview • Customized protocol • Contains src/dest MAC, IP, and port • Unique packet ID and length of payload • Padding for ease of reading in blocks • Add own protocol header in the future
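The header described on this slide can be pictured as a fixed-size C struct. This is only a sketch: the struct name, field names, field widths, field order, the 32-byte total, and the 8-byte block size are all assumptions, since the slide lists the fields but not their layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout; names, widths, and order are assumptions. */
struct offload_hdr {
    uint8_t  dst_mac[6];
    uint8_t  src_mac[6];
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t pkt_id;       /* unique packet ID */
    uint16_t payload_len;  /* length of payload in bytes */
    uint8_t  pad[2];       /* rounds the header up to 32 bytes */
};

/* With these widths the header is a whole number of 8-byte blocks. */
_Static_assert(sizeof(struct offload_hdr) == 32, "header must be 32 bytes");

/* Round a length up to the next multiple of `block` bytes. */
size_t padded_len(size_t len, size_t block)
{
    return (len + block - 1) / block * block;
}
```

With an assumed 8-byte block, a 13-byte payload would be padded to 16 bytes, so the receiver never has to read a partial block.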
Socket Table • Socket id • Src/dst IP • Src/dst port • User send/recv buffer start • Phys send/recv buffer start (for DMA) • Send/recv buffer size • Recv pointer (where data was last written)
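A minimal sketch of one socket table entry and a lookup for incoming packets follows. The fixed capacity, the linear scan, and every identifier (`sock_entry`, `sock_add`, `sock_lookup_incoming`) are assumptions for illustration; the slide only names the fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SOCKETS 16  /* assumed capacity */

struct sock_entry {
    int       in_use;
    int       sock_id;
    uint32_t  src_ip, dst_ip;        /* local and remote IP */
    uint16_t  src_port, dst_port;    /* local and remote port */
    void     *user_send, *user_recv; /* user-space buffer starts */
    uint64_t  phys_send, phys_recv;  /* physical buffer starts, for DMA */
    size_t    send_size, recv_size;  /* buffer sizes in bytes */
    size_t    recv_ptr;              /* offset where data was last written */
};

static struct sock_entry sock_table[MAX_SOCKETS];

/* Claim the first free entry; returns the socket id, or -1 if the table is full. */
int sock_add(uint32_t src_ip, uint32_t dst_ip,
             uint16_t src_port, uint16_t dst_port)
{
    for (int i = 0; i < MAX_SOCKETS; i++) {
        struct sock_entry *e = &sock_table[i];
        if (!e->in_use) {
            e->in_use = 1;
            e->sock_id = i;
            e->src_ip = src_ip;
            e->dst_ip = dst_ip;
            e->src_port = src_port;
            e->dst_port = dst_port;
            return i;
        }
    }
    return -1;
}

/* An incoming packet's destination is matched against the local endpoint. */
struct sock_entry *sock_lookup_incoming(uint32_t pkt_dst_ip, uint16_t pkt_dst_port)
{
    for (int i = 0; i < MAX_SOCKETS; i++) {
        struct sock_entry *e = &sock_table[i];
        if (e->in_use && e->src_ip == pkt_dst_ip && e->src_port == pkt_dst_port)
            return e;
    }
    return NULL;
}
```

A linear scan is fine for a handful of sockets; a hash on IP and port would be the usual replacement if the table grew.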
Theory of Operation (host) • Setup • Client application calls the open() system call; kernel adds a new entry to the lookup table and returns a unique socket descriptor • Client calls setup_connection(), a macro wrapping ioctl calls, to add src/dst IP and requested ports for the connection • Client calls ixpmalloc(), a macro that mmap()s kernel memory into user space, used for send and receive buffers
Theory of Operation (host) • Send • Macro wrapping the driver's write() function • If no space is left on the IXP, return an error; otherwise DMA/PIO to the IXP and return the number of bytes copied • Currently uses PIO, for lower latency • Receive • If no data is available, return an error • User must loop/poll for data
(1) Host Send • User calls send with data • Kernel assembles the packet header based on the socket table • Looks up the Downbuf tail/num in the bridgeCSR • Checks if Downbuf is full • Copies data by DMA/PIO • Updates Downbuf tail in bridgeCSR
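The full check and tail update in the host send path can be sketched as a ring of fixed-size slots. Everything here is an assumption made for illustration: the slot count and size, the `bridge_csr` field names, the one-empty-slot convention for detecting a full ring, and the use of `memcpy` as a stand-in for the PIO/DMA copy over PCI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DOWNBUF_SLOTS 4   /* assumed ring size */
#define SLOT_BYTES    64  /* assumed slot size */

/* Hypothetical view of the head/tail indices kept in the bridge CSR. */
struct bridge_csr {
    uint32_t down_head;  /* advanced by the IXP when a slot is done */
    uint32_t down_tail;  /* advanced by the host after each copy */
};

struct downbuf {
    struct bridge_csr csr;
    uint8_t slot[DOWNBUF_SLOTS][SLOT_BYTES];
};

/* Returns the number of bytes copied, or -1 if there is no space. */
int host_send(struct downbuf *db, const void *pkt, size_t len)
{
    if (len > SLOT_BYTES)
        return -1;

    uint32_t tail = db->csr.down_tail;
    uint32_t next = (tail + 1) % DOWNBUF_SLOTS;

    if (next == db->csr.down_head)   /* ring is full */
        return -1;

    memcpy(db->slot[tail], pkt, len);  /* stands in for PIO/DMA */
    db->csr.down_tail = next;          /* publish the new tail last */
    return (int)len;
}
```

Updating the tail only after the copy means the IXP side never sees a slot whose data has not fully arrived.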
(2) Downbuf Manager • Polls bridgeCSR to get new tail of DOWNBUF • Examine protocol header • Fill in metadata (packet size, id etc) • Dispatch to worker ME • Examine the head of DOWNBUF, advance head if it is processed • Diagram labels: tail, head; slot states: just arrived, in-progress, finished
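The manager's two jobs, dispatching newly arrived slots and advancing the head past finished ones, might look like the sketch below. The slot states mirror the diagram labels, but the extra `dispatched` index, the ring size, and all function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 8  /* assumed ring size */

enum slot_state { SLOT_FREE, SLOT_ARRIVED, SLOT_IN_PROGRESS, SLOT_FINISHED };

struct down_ring {
    uint32_t head;        /* oldest slot not yet reclaimed */
    uint32_t dispatched;  /* next slot to hand to a worker (assumed index) */
    uint32_t tail;        /* next slot the host will write */
    enum slot_state state[RING_SLOTS];
};

/* Hand every slot between `dispatched` and `tail` to a worker. */
int dispatch_new(struct down_ring *r)
{
    int n = 0;
    while (r->dispatched != r->tail) {
        r->state[r->dispatched] = SLOT_IN_PROGRESS;
        r->dispatched = (r->dispatched + 1) % RING_SLOTS;
        n++;
    }
    return n;
}

/* Advance head over finished slots, stopping at the first unfinished one. */
int reclaim_finished(struct down_ring *r)
{
    int n = 0;
    while (r->head != r->dispatched && r->state[r->head] == SLOT_FINISHED) {
        r->state[r->head] = SLOT_FREE;
        r->head = (r->head + 1) % RING_SLOTS;
        n++;
    }
    return n;
}
```

Because the head only moves in order, a slow packet at the head keeps later finished slots from being reclaimed until it completes.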
(3) Downbuf Worker • Get a packet from Downbuf manager • Extract IP from packet • Look up MAC and output port in IP forwarding table • Write MAC into packet • Fill in metadata (packet offset, len, output port) • Put the packet in TX queue
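The worker's lookup step can be sketched as an exact-match search of a small forwarding table. The table contents are made-up illustration values, exact matching (rather than longest-prefix matching) is an assumption, and the destination IP is passed in directly instead of being parsed out of the packet. Writing the MAC at offset 0 of the packet is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct fwd_entry {
    uint32_t ip;        /* destination IP, host byte order */
    uint8_t  mac[6];    /* next-hop MAC address */
    uint8_t  out_port;  /* output port number */
};

/* Illustrative entries only; not taken from the slides. */
static const struct fwd_entry fwd_table[] = {
    { 0x0a000001u, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 }, 0 },
    { 0x0a000002u, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 }, 1 },
};

/* Exact-match lookup; returns NULL when the IP is not in the table. */
const struct fwd_entry *fwd_lookup(uint32_t dst_ip)
{
    size_t n = sizeof fwd_table / sizeof fwd_table[0];
    for (size_t i = 0; i < n; i++) {
        if (fwd_table[i].ip == dst_ip)
            return &fwd_table[i];
    }
    return NULL;
}

/* Write the destination MAC at the start of the packet.
 * Returns the output port, or -1 if there is no route. */
int worker_route(uint8_t *pkt, uint32_t dst_ip)
{
    const struct fwd_entry *e = fwd_lookup(dst_ip);
    if (e == NULL)
        return -1;
    memcpy(pkt, e->mac, sizeof e->mac);
    return e->out_port;
}
```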
(4) XMIT ME • Get a packet from TX queue • Split packet into mpackets • Move mpackets to TBUF of MSF • Enable sending mpackets
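Splitting a packet into mpackets amounts to cutting it into fixed-size pieces and marking the first and last. The sketch below assumes a 64-byte mpacket and uses hypothetical names (`struct mpacket`, `split_mpackets`); it only computes offsets and flags and does not touch the TBUF.

```c
#include <assert.h>
#include <stddef.h>

#define MPACKET_BYTES 64  /* assumed mpacket size */

struct mpacket {
    size_t offset;  /* byte offset within the packet */
    size_t len;     /* bytes carried by this mpacket */
    int    sop;     /* 1 on the first mpacket of a packet */
    int    eop;     /* 1 on the last mpacket of a packet */
};

/* Fill `out` with up to `max_out` descriptors; returns how many were written. */
size_t split_mpackets(size_t pkt_len, struct mpacket *out, size_t max_out)
{
    size_t n = 0;
    size_t off = 0;

    while (off < pkt_len && n < max_out) {
        size_t chunk = pkt_len - off;
        if (chunk > MPACKET_BYTES)
            chunk = MPACKET_BYTES;

        out[n].offset = off;
        out[n].len = chunk;
        out[n].sop = (off == 0);
        out[n].eop = (off + chunk == pkt_len);

        off += chunk;
        n++;
    }
    return n;
}
```

Under these assumptions a 150-byte packet yields three mpackets of 64, 64, and 22 bytes, with the start flag on the first and the end flag on the last.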
(5) RCV ME • Woken up by MSF for new packets • Assemble and move mpackets to DRAM • Fill in metadata (packet offset, size, input port etc) • Put packet in RX queue
(6) Upbuf Manager • Get a packet from RX queue • Get meta data • Examine packet header (dstIP, destPort etc) • Look up socket table for up_rcv_ptr • Prepare DMA descriptor • Enable DMA • Update up_tail in bridge CSR
(7) Host RCV • User requests receive data • Kernel looks up the bridgeCSR from socket table • Returns the number of bytes that have arrived, or an error if no data is available
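The receive call's behavior, returning a byte count or an error when nothing has arrived, can be sketched with a byte ring. The structure and names below are assumptions: `wr` plays the role of the tail that the IXP side advances after its DMA, and `rd` is a host-side read offset that the slides do not mention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical receive ring living in the socket's receive buffer. */
struct rcv_ring {
    uint8_t *buf;
    size_t   size;  /* buffer size in bytes */
    size_t   rd;    /* host read offset (assumed) */
    size_t   wr;    /* write offset, advanced by the IXP side */
};

/* Copy up to `max` bytes into dst. Returns bytes copied, or -1 if none. */
long host_recv(struct rcv_ring *r, void *dst, size_t max)
{
    size_t avail = (r->wr + r->size - r->rd) % r->size;
    if (avail == 0)
        return -1;  /* no data: the caller must poll again */

    size_t n = avail < max ? avail : max;

    /* Copy in two pieces when the data wraps past the end of the buffer. */
    size_t first = r->size - r->rd;
    if (first > n)
        first = n;
    memcpy(dst, r->buf + r->rd, first);
    memcpy((uint8_t *)dst + first, r->buf, n - first);

    r->rd = (r->rd + n) % r->size;
    return (long)n;
}
```

Since the call never blocks, the user loops on it until it stops returning -1, which matches the polling model described for the host.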
Future additions • Use Interrupts • Use bridge registers more efficiently • Own protocol header