Gigabit Rate Packet Pattern-Matching Using TCAM • Fang Yu, Randy H. Katz (UC Berkeley) • T. V. Lakshman (Bell Labs, Lucent) • ICNP 2004
Motivation • Malicious probes and worms spread • Solutions: • End-host based • Anti-virus software, security patches • Ineffective and costly • Network based • Network Intrusion Detection Systems (NIDS) • Payload processing for thousands of complicated content patterns at line speed • Fast and scalable multi-pattern matching schemes are highly needed
Current Pattern Matching Schemes • Software-based solutions • Low speed • FPGA-based solutions • Do not scale well in space or overall latency for a large number of patterns • Bloom filters • Able to handle thousands of patterns • Build a Bloom filter for each possible pattern length • Hard to handle hundreds of possible pattern lengths
Problem Definition • Pattern matching problem • Given: a set of k patterns {P1, P2, …, Pk}, k >= 1, and a packet of length n • Goal: find all the matching patterns in the packet • Simple patterns: • Deterministic form: each byte is one specific value out of the 256 possible values • Non-deterministic form: • Case-insensitive alphabet • Wildcard byte (*) • Composite patterns: • Negation (!) • Correlated patterns
TCAM • Three logic states: ‘0’, ‘1’, ‘?’ (don't care) • Given an input string, the TCAM reports the lowest-index match if there are multiple matches • 4 ns lookup time • Single-chip density ~ 2MB • Width of each entry is configurable
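The lookup rule on this slide (ternary entries, lowest index wins) can be illustrated with a small software model. This is only a sketch: it works on characters rather than bits, the names `entry_matches` and `tcam_lookup` are hypothetical, and it assumes the input key never contains a literal `?`.

```python
from typing import Optional, Sequence

WILDCARD = "?"  # models the ternary "don't care" state, per character


def entry_matches(entry: str, key: str) -> bool:
    """True if every position of `entry` equals `key` or is a wildcard."""
    return len(entry) == len(key) and all(
        e == WILDCARD or e == k for e, k in zip(entry, key)
    )


def tcam_lookup(entries: Sequence[str], key: str) -> Optional[int]:
    """Return the lowest 1-based index whose entry matches `key`, else None."""
    for index, entry in enumerate(entries, start=1):
        if entry_matches(entry, key):
            return index
    return None


entries = ["ABCD", "DEFG", "BCDL", "GDEF", "DEF?"]
print(tcam_lookup(entries, "DEFG"))  # entries 2 and 5 both match; 2 is reported
print(tcam_lookup(entries, "DEFX"))  # only the wildcard entry 5 matches
print(tcam_lookup(entries, "XXXX"))  # no match
```

A hardware TCAM compares all entries in parallel in one lookup; the loop here only reproduces the priority rule, not the timing.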
Simple Pattern Matching Using TCAM • Short patterns: length <= TCAM width w • Pad with ‘?’ if shorter than w • Organize patterns by length in descending order • The input packet is shifted one byte at a time • Throughput: 2Gbps
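A minimal sketch of the short-pattern scheme, assuming character-level wildcards and a filler byte (`\x00`, an assumption of this sketch) to complete the last windows of a packet. The helper names `build_table` and `scan` are hypothetical. The final lines show how the 2Gbps figure appears to follow from one byte consumed per 4 ns lookup.

```python
def build_table(patterns, w):
    """Pad each pattern to width w with '?', longest patterns first."""
    ordered = sorted(patterns, key=len, reverse=True)
    return [(p + "?" * (w - len(p)), p) for p in ordered]


def scan(payload, table, w, filler="\x00"):
    """Slide a w-byte window one byte at a time; report the first matching entry."""
    hits = []
    for pos in range(len(payload)):
        # Near the end of the packet the window is completed with filler bytes,
        # which can only match '?' positions (an assumption of this sketch).
        window = payload[pos:pos + w].ljust(w, filler)
        for entry, pattern in table:
            if all(e == "?" or e == c for e, c in zip(entry, window)):
                hits.append((pos, pattern))
                break  # lowest index wins, as in the TCAM
    return hits


table = build_table(["AB", "DEFG", "DEF"], w=4)
print(scan("XABDEFGDEF", table, w=4))

# One byte (8 bits) is consumed per 4 ns lookup; bits per ns equals Gbit/s.
throughput_gbps = 8 / 4
print(throughput_gbps)
```

Sorting longest first means that when both DEFG and DEF? match a window, the longer pattern sits at the lower index and is the one reported.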
Simple Pattern Matching Using TCAM • Long patterns: length > TCAM width w • Divide each long pattern into multiple short patterns • Prefix pattern: first w bytes • Suffix patterns: each subsequent group of w bytes. If the last suffix pattern is shorter than w bytes, pad it at the front with the preceding bytes. • Example (w = 4): DEFGABCDL • Prefix pattern: DEFG • Suffix patterns: ABCD, BCDL
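The splitting rule can be sketched as follows. The function name `split_long_pattern` is hypothetical; the behavior follows the slide's description, with w = 4 assumed for the example.

```python
def split_long_pattern(pattern: str, w: int):
    """Split a pattern longer than w into a prefix and a list of suffix patterns."""
    if len(pattern) <= w:
        raise ValueError("pattern must be longer than the TCAM width")
    prefix = pattern[:w]
    suffixes = []
    pos = w
    while pos < len(pattern):
        if pos + w <= len(pattern):
            suffixes.append(pattern[pos:pos + w])
        else:
            # Last piece is short: pad it at the front with the preceding
            # bytes, i.e. take the final w bytes of the pattern.
            suffixes.append(pattern[-w:])
        pos += w
    return prefix, suffixes


print(split_long_pattern("DEFGABCDL", 4))  # ('DEFG', ['ABCD', 'BCDL'])
```

Padding at the front keeps every piece exactly w bytes of real pattern content, so no wildcard positions are needed for long patterns.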
Patterns in TCAM (index: entry) • 1: A B C D • 2: D E F G • 3: B C D L • 4: G D E F • 5: D E F ?
Data Structures in SRAM • Combined Pattern Table • Example entry: DEFGABCD (3) • TCAM contents (index: entry): 1: A B C D, 2: D E F G, 3: B C D L, 4: G D E F, 5: D E F ?
Data Structures in SRAM • Matching Table • Partial Hit List (PHL) • Generated during matching process
Algorithm for Long Pattern Matching • [Figure: matching flow across the Combined Pattern Table, the Matching Table, and the Partial Hit List (PHL)]
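The slides do not spell out the table layouts, so the following is only a simplified software simulation of the idea: a prefix hit opens a partial hit, and each later piece must appear at its expected position for the partial hit to advance. All names (`split_pattern`, `find_long_patterns`, `prefix_table`) and the choice to key the PHL by expected position are assumptions of this sketch, not the authors' data structures.

```python
from collections import defaultdict


def split_pattern(pattern, w):
    """Return [(offset, piece), ...]; the last piece is padded at the front."""
    pieces = [(off, pattern[off:off + w])
              for off in range(0, len(pattern) - w + 1, w)]
    if len(pattern) % w:
        pieces.append((len(pattern) - w, pattern[-w:]))
    return pieces


def find_long_patterns(payload, patterns, w):
    """Report (start, pattern) for each long pattern found in the payload."""
    if any(len(p) <= w for p in patterns):
        raise ValueError("this sketch only handles patterns longer than w")
    pieces = {p: split_pattern(p, w) for p in patterns}
    prefix_table = defaultdict(list)  # prefix piece -> patterns starting with it
    for p, parts in pieces.items():
        prefix_table[parts[0][1]].append(p)
    phl = defaultdict(list)  # expected position -> [(pattern, start, piece index)]
    matches = []
    for pos in range(len(payload) - w + 1):
        window = payload[pos:pos + w]  # stands in for one TCAM lookup
        # Advance partial hits that expect their next piece at this position.
        for pattern, start, idx in phl.pop(pos, []):
            if window != pieces[pattern][idx][1]:
                continue  # expected piece is absent, so the partial hit is dropped
            if idx + 1 == len(pieces[pattern]):
                matches.append((start, pattern))
            else:
                next_offset = pieces[pattern][idx + 1][0]
                phl[start + next_offset].append((pattern, start, idx + 1))
        # Open a new partial hit for every pattern whose prefix just matched.
        for pattern in prefix_table.get(window, []):
            next_offset = pieces[pattern][1][0]
            phl[pos + next_offset].append((pattern, pos, 1))
    return matches


print(find_long_patterns("xxDEFGABCDLyy", ["DEFGABCDL"], w=4))
```

Because entries are removed from the list as soon as their expected position is reached, the list only holds partial hits that are still viable, which is the quantity the later PHL-size slides measure.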
Composite Pattern Matching • Correlated patterns • Partial hit records for sub-patterns are kept in the PHL because the distance between two sub-patterns can be larger than w • Example: content: “user”; content: “root”; within 20 • Prefix: user; suffix: root; distance: 4 to 20, giving 17 entries in the matching table • Patterns with negations • Usually part of a correlated pattern • Patterns with wildcards • The distance between an upper-case character and its corresponding lower-case character is 32 (a single bit, 0x20, in ASCII).
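Two pieces of arithmetic on this slide can be checked directly: the 17 matching-table entries for distances 4 through 20, and the 32-value gap between ASCII cases. Using a `?` on that single bit to match both cases is an inference about how the gap could be exploited in a ternary entry, not something the slide states; the helper names are hypothetical.

```python
# Distances 4..20 inclusive; the lower bound equals len("user").
distances = range(len("user"), 20 + 1)
print(len(distances))  # 17


def case_insensitive_entry(letter: str) -> str:
    """8-character ternary entry for an ASCII letter with the case bit set to '?'."""
    if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
        raise ValueError("expected a single ASCII letter")
    bits = format(ord(letter), "08b")
    # The bit worth 32 (0x20) is the third character from the left.
    return bits[:2] + "?" + bits[3:]


def ternary_match(entry: str, byte: int) -> bool:
    """Compare one byte against a ternary entry of '0', '1' and '?'."""
    bits = format(byte, "08b")
    return all(e == "?" or e == b for e, b in zip(entry, bits))


entry = case_insensitive_entry("A")
print(entry)                                # 01?00001
print(ternary_match(entry, ord("a")))       # True
print(ternary_match(entry, ord("B")))       # False
```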
Analysis • What is the impact of TCAM width on the scheme? • Setting: k patterns of m_i bytes each, TCAM width w, and a random input stream
Analysis • What is the impact of memory lookups on system scan rate? • The two kinds of memory lookups can be pipelined • With a small TCAM hit rate and a small PHL, overall scan time is dominated by TCAM lookup time
Malicious Attacks? • Correlated patterns can cause problems • Distance between sub-patterns can be larger than w • Larger PHL size → backlogged memory lookups → lower scan rate • Sub-patterns can be short • Higher hit rate → larger PHL size → lower scan rate • The probability of matching two patterns of 1 byte apart is very small, but packing sub-patterns consecutively to form a long packet can create a large PHL • Countermeasure: limit the maximum distance between sub-patterns
Simulation Results • Rule sets: • ClamAV (v0.15) virus signature database • 1768 simple patterns • Average pattern length = 55 bytes • Pattern length: 6 ~ 2189 bytes • SNORT (v2.1.2) • 1039 simple patterns, 527 correlated patterns • Mostly 10 ~ 100 bytes, some 1 ~ 4 bytes long • Packet traces: • Real – MIT trace (1M), Berkeley trace (6M) • Synthetic – Randomly insert patterns in packet payload
ClamAV Pattern Set • [Figure: TCAM space consumed (KB) vs. TCAM width (in bytes)] • [Figure: Memory space for mapping table, matching table size (MB) vs. TCAM width (in bytes)] • w = 128 bytes • TCAM = 240KB • SRAM < 10MB
ClamAV Pattern Set • [Figure: PHL size for ClamAV pattern set with real traces] • Avg PHL: Mean of average PHL size over all packets • AvgMax PHL: Mean of maximum PHL size over all packets • Max: Maximum PHL size in all packets
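The three PHL statistics defined above can be computed as in the sketch below. The input format (one list of PHL sizes per packet, sampled at each scan position) and the numbers are made up for illustration and are not measurements from the traces.

```python
from statistics import mean


def phl_stats(per_packet_sizes):
    """Return (Avg PHL, AvgMax PHL, Max) from per-packet lists of PHL sizes."""
    avg_phl = mean(mean(sizes) for sizes in per_packet_sizes)
    avg_max_phl = mean(max(sizes) for sizes in per_packet_sizes)
    max_phl = max(max(sizes) for sizes in per_packet_sizes)
    return avg_phl, avg_max_phl, max_phl


# Illustrative data only: PHL size observed at each position of three packets.
samples = [[0, 1, 2, 1], [0, 0, 0, 0], [3, 3, 0, 2]]
avg_phl, avg_max_phl, max_phl = phl_stats(samples)
print(avg_phl, avg_max_phl, max_phl)
```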
ClamAV Pattern Set • [Figure: PHL size for ClamAV pattern set with synthetic traces] • SRAM lookups can keep up with the TCAM lookups • Scan rate = 2Gbps
SNORT Pattern Set • w = 128 bytes, TCAM size = 295KB • [Figure: PHL size for SNORT pattern set with real traces]
SNORT Pattern Set • [Figure: Effects of memory ratio on scan ratio] • Scan Ratio = Total scan time / Total TCAM lookup time • Memory Ratio = SRAM access time / TCAM access time • Scan rate > 1Gbps
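One way to relate the scan ratio to a scan rate, assuming the TCAM-only rate is the 2Gbps figure quoted earlier in the deck: dividing that rate by the scan ratio gives the effective rate, so a scan rate above 1Gbps corresponds to a scan ratio below 2. This relation is a reading of the definitions above rather than a formula from the slides, and the function name is hypothetical.

```python
def effective_scan_rate_gbps(scan_ratio: float, tcam_only_gbps: float = 2.0) -> float:
    """Effective scan rate when total scan time is scan_ratio x TCAM lookup time."""
    if scan_ratio < 1:
        raise ValueError("total scan time cannot be below TCAM lookup time")
    return tcam_only_gbps / scan_ratio


print(effective_scan_rate_gbps(1.0))  # SRAM fully hidden behind TCAM lookups
print(effective_scan_rate_gbps(2.0))  # scan takes twice the TCAM lookup time
```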
Conclusion • A simple multi-pattern matching algorithm using TCAM • Supports thousands of patterns with variable lengths • Supports long patterns, correlated patterns, and patterns with negation and wildcards • Achieves multi-gigabit rates on the ClamAV and SNORT pattern sets