A SCALABLE PIPELINE ARCHITECTURE FOR LINE RATE PACKET CLASSIFICATION ON FPGAS

A SCALABLE PIPELINE ARCHITECTURE FOR LINE RATE PACKET CLASSIFICATION ON FPGAS Authors:Jeffrey M. Wagner 、 Weirong Jiang、Viktor K. Prasanna Conf. :International Conference on Parallel and Distributed Computing and Systems (PDCS '09) Presenter : JHAO-YAN JIAN Date : 2010/10/20

Outline • Introduction • Memory balancing and waste reduction • Tree partitioning and subtree inversion • Bidirectional fine-grained mapping • Configurable rule list mapping • Hardware architecture implementation • Experimental results

Introduction(1/2) • due to the rapid growth of the rule set size and the current link rate having been pushed beyond the OC-768 rate, i.e. 40 Gbps. • such throughput is impossible using existing software-based solutions • Pipelining approach for decision tree • In an unbalanced pipeline, the “fattest” stage, which stores the largest number of tree nodes / rules. • More time is needed to access the pipeline stages containing larger local memory • need to allocate memory with the maximum size for each stage. Such overprovisioning results in memory wastage and excessive power consumption.

Introduction(2/2) • State-of-the-art techniques cannot achieve perfectly balanced memory distribution: • 1. They require the tree nodes on the same level be mapped onto the same stage.(Fine-grained mapping) • 2. Their mapping scheme is uni-directional: the subtrees partitioned from the original tree must be mapped in the same.(Bidirectional mapping) • Fine-grained mapping : • If node A is an ancestor of node B in a tree, then A must be mapped to a stage preceding the stage to which B is mapped.

Tree partitioning and subtree inversion • Inverted : Based on Largestleaf heuristic, select S with the most number of leaves from those not inverted. And then W = W− (nodesize of S root node) +(# of leaves of S) • W :total nodesize of all subtree root nodes、CAP :idealstage capacity、IFR :inversion factor、S :subtrees • Selection end : W >= (IFR × CAP)

Bidirectional fine-grained mapping • Decision tree mapping utilizes the two sets of subtrees which are mapped from roots are called the forward subtrees, while the others are called the reverse subtrees. • It manage two lists, ReadyListand NextReadyList. • Initial: Push the roots of forward subtrees and the leaves of reverse subtrees into ReadyList. • Mi :the memory address pointer for stage i.

Configurable rule list mapping • The configurable mapping algorithm maps the rule lists to a second independent pipeline. • Fine-grained mapping

Configurable rule list mapping • Sort the rule lists in ReadyList in decreasing order of ruledepth. • Dual-port mapping: allows a rule list to map its rules vertically within a single pipeline stage memory by occupying two consecutive memory addresses. • Variable-width mapping:allows a rule list to map its rules horizontally within a single pipeline stage

Hardware architecture implementation • Direction Index Table (DIT) outputs: • (1) the distance to the stage where the subtree root is stored in the decision tree pipeline • (2) the memory address of the root in that stage • (3) the mapping direction which leads the packet to the different entrances of the decision tree pipeline.

Bidirectional pipeline(1/2) • The decision tree pipeline stage memory is constructed using a dual-port synchronous read RAM.

Bidirectional pipeline(2/2) • Cut logic: • propose a circuit design using left and right shifters. • To generate the next node pointer, the MSB bits of the packet fields, number of cuts on each field, and a base address are used. • Distance checker: • allows two nodes on the same tree level to be mapped to different stages. • storing the distance to the pipeline stage where the child node is stored in the local pipeline stage memory. • When the distance value becomes 0, the child node’s address is used to access the memory in that stage.

Configurable pipeline(1/2) • The rule list pipeline stage memory is configurable based on memory port and width configurations.

Configurable pipeline(2/2) • Rule match and compare logic • If memory port and width configurations are utilized, multiple rules can be retrieved in a single stage. • Rule checker • similar to the bidirectional pipeline distance checker • The rule count field stores the distance to the pipeline stage where the end of the rule list is.

Experimental results(1/2)

Experimental results(2/2)

A SCALABLE PIPELINE ARCHITECTURE FOR LINE RATE PACKET CLASSIFICATION ON FPGAS

A SCALABLE PIPELINE ARCHITECTURE FOR LINE RATE PACKET CLASSIFICATION ON FPGAS

Presentation Transcript

Packet Classification

DART: A Programmable Architecture for NoC Simulation on FPGAs

Packet Classification

ClassBench: A Packet Classification Benchmark

Scalable Classification

A Scalable Internet Architecture

Packet classification on Multiple Fields

Compact Trie Forest: Scalable architecture for IP Lookup on FPGAs

BPC: A language for packet classification

Packet Classification On Multiple Fields

Packet Classification on Multiple Fields

Towards a Packet Classification Benchmark

A Scalable Architecture for LDPC Decoding

High-performance Architecture for Dynamically Updatable Packet Classification on FPGA

A FPGA-based Parallel Architecture for Scalable High-Speed Packet Classification

Packet Classification on Multiple Fields

Packet Classification on PLUG Architecture

Approximate Caches for Packet Classification

The Designs and Analysis of a Scalable Optical Packet Switching Architecture

Large-Scale Wire-Speed Packet Classification on FPGAs