Scalable High-Performance Parallel Design for NIDS on Many-Core Processors


Presentation Transcript


  1. Haiyang Jiang, Gaogang Xie, Kave Salamatian and Laurent Mathy. Scalable High-Performance Parallel Design for NIDS on Many-Core Processors

  2. Outline
  • Background & Motivation
  • Our Approach
  • Evaluation
  • Conclusion

  3. Network Intrusion Detection Systems
  • Signature-based NIDS is the de-facto standard
  • Deep Packet Inspection (DPI) is a crucial component of NIDS
  • DPI consumes 70%-80% of the processing time

  4. Performance Challenges
  • Driven by continuous growth in both traffic volume (Traffic ↑) and ruleset size (Ruleset ↑)

  5. Many-core Processors
  • A step beyond the single-core processor
  • Attractive for NIDS due to their powerful parallelism
  (Chart: The Mother of All CPU Charts 2005/2006, Bert Töpelt, Daniel Schuhmann, Frank Völkel, Tom's Hardware Guide, Nov. 2005)

  6. The State of the Art
  • Many-core processor-based NIDS: higher flexibility and lower cost
  • But lower performance than other solutions

  7. Limitations of Prior Art
  • Two kinds of parallel models exist for NIDS; the first is data parallelism
  • Advantages: thread isolation
  • Disadvantages: memory consumption, poor reference locality

  8. Limitations of Prior Art
  • The second model is function parallelism
  • Advantages: fine-grained, good reference locality
  • Disadvantages: stage contentions, message transfer among stages
  (Both models are sketched in code below)
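As a concrete illustration of the two models, here is a minimal C sketch (not the authors' code); capture(), decode(), detect() and the stage queues are hypothetical stubs.

```c
#include <pthread.h>

typedef struct pkt pkt_t;
extern pkt_t *capture(void);               /* next packet from the NIC   */
extern void   decode(pkt_t *);             /* protocol processing        */
extern void   detect(pkt_t *);             /* DPI / signature matching   */
extern pkt_t *queue_pop(void *q);          /* blocking inter-stage queue */
extern void   queue_push(void *q, pkt_t *);
extern void  *capture_q, *detect_q;

/* Data parallelism: each thread runs the whole pipeline on its own
 * share of traffic -- isolated, but detection state is duplicated per
 * thread (memory consumption, poor reference locality). */
static void *data_parallel_worker(void *arg) {
    (void)arg;
    for (;;) { pkt_t *p = capture(); decode(p); detect(p); }
    return NULL;
}

/* Function parallelism: one stage per thread, packets handed over
 * queues -- fine-grained with good locality, but the queues are the
 * stage-contention and message-transfer points named above. */
static void *decode_stage(void *arg) {
    (void)arg;
    for (;;) { pkt_t *p = queue_pop(capture_q); decode(p); queue_push(detect_q, p); }
    return NULL;
}
```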

  9. Parallel Design Issues
  • Communication contention is the bottleneck

  10. Features of Many-core Processors
  • Dozens of cores (e.g., the TILERAGX with 36 cores)
  • Accelerated hardware modules
  • mPIPE: packet capture engine
  • User Dynamic Network (UDN): on-chip network for communication among cores
  (Example many-core processor: TILERAGX 36)

  11. Our Approach
  • Goal: high-performance, flexible, scalable, inexpensive
  • Two schemes: a hybrid parallel scheme and a hybrid load balancing scheme
  (Chart, performance vs. flexibility: hardware designs are high-performance but inflexible, expensive and unscalable; software designs are flexible and inexpensive but slower; our goal combines high performance, flexibility, scalability and low cost)

  12. Hybrid Parallel Scheme
  • A combination of the two models
  • Data parallelism across Packet Processing Modules (PPMs)
  • Function parallelism within each PPM (see the sketch below)
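A hedged sketch of how the hybrid layout could be spawned: PPMs replicated data-parallel, with function-parallel stage threads inside each PPM. The per-stage thread names are illustrative; the paper's actual per-PPM thread counts appear on slide 18.

```c
#include <pthread.h>

#define N_PPM 4  /* number of PPMs; 4 matches the evaluation setup */

extern void *capture_thread(void *ppm_id);   /* stage threads:       */
extern void *protocol_thread(void *ppm_id);  /* function parallelism */
extern void *detect_thread(void *ppm_id);    /* inside one PPM       */

int main(void) {
    pthread_t t;
    for (long ppm = 0; ppm < N_PPM; ppm++) { /* data parallelism across PPMs */
        pthread_create(&t, NULL, capture_thread,  (void *)ppm);
        pthread_create(&t, NULL, protocol_thread, (void *)ppm);
        pthread_create(&t, NULL, detect_thread,   (void *)ppm);
    }
    pthread_exit(NULL); /* let the worker threads run on */
}
```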

  13. Hybrid Parallel Scheme
  • Shared resource among PPMs: the Message (MSG) pool

  14. MSG Pool Contentions
  • Caused by the lock on the MSG pool
  • Exploit mPIPE to access the MSG pool in parallel
  • Each packet has an individual MSG structure, so the MSG pool lock is eliminated: each raw packet carries its corresponding MSG (sketch below)
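A minimal sketch of the lock-elimination idea, with a hypothetical buffer layout: the MSG is embedded in (or bound one-to-one to) each raw packet buffer that mPIPE delivers, so no thread ever takes a pool lock.

```c
#include <stdint.h>

typedef struct {
    uint32_t flow_hash;    /* written by the capture stage     */
    uint16_t proto_state;  /* written by the protocol stage    */
    uint8_t  verdict;      /* written by the detection engines */
} msg_t;

typedef struct {
    msg_t   msg;           /* one MSG per packet: no shared pool lock */
    uint8_t data[];        /* raw packet bytes delivered by mPIPE     */
} pkt_buf_t;

/* Any stage reaches the packet's MSG without synchronization: */
static inline msg_t *pkt_msg(pkt_buf_t *b) { return &b->msg; }
```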

  15. MSG Propagation Contentions
  • Caused by MSG propagation among stages
  • Exploit the UDN to transfer MSGs: higher bandwidth and lower latency (sketch below)
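A hedged sketch of a stage-to-stage MSG hand-off over the UDN. The call names follow the Tilera MDE's tmc/udn.h interface as documented; treat the exact signatures as assumptions rather than verified code.

```c
#include <stdint.h>
#include <tmc/udn.h>   /* Tilera MDE UDN interface */

/* One-time setup (not shown): tmc_udn_init() for the participating
 * cpus, then tmc_udn_activate() in every thread. */

/* Sender (e.g., a protocol thread): push a MSG pointer to the core
 * running the next stage as a single UDN word -- no shared queue. */
static void send_msg_to(int dest_cpu, void *msg) {
    DynamicHeader h = tmc_udn_header_from_cpu(dest_cpu);
    tmc_udn_send_1(h, UDN0_DEMUX_TAG, (uint64_t)(uintptr_t)msg);
}

/* Receiver (e.g., a detection thread): block on its UDN0 queue. */
static void *recv_msg(void) {
    return (void *)(uintptr_t)tmc_udn0_receive();
}
```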

  16. Hybrid Load Balancing Scheme
  • First level, PPMs: flow-based hashing for load balancing in mPIPE
  • Second level, protocol processing threads: flow-based hashing for load balancing in the pipeline
  • Third level, detection engine threads: rule partition balancing (RPB)
  (A flow-hash sketch follows)
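For the two flow-hash levels, here is a sketch of a direction-symmetric 5-tuple hash; the mixing step is illustrative, not the hash actually used by mPIPE or the paper.

```c
#include <stdint.h>

/* XOR of the endpoints makes the hash symmetric, so both directions
 * of a flow land on the same PPM and the same protocol thread. */
static uint32_t flow_hash(uint32_t sip, uint32_t dip,
                          uint16_t sport, uint16_t dport, uint8_t proto) {
    uint32_t h = (sip ^ dip) ^ ((uint32_t)(sport ^ dport) << 16) ^ proto;
    h ^= h >> 16; h *= 0x45d9f3bu; h ^= h >> 16;   /* cheap integer mix */
    return h;
}

/* Level 1 picks a PPM; level 2 picks a protocol thread inside it. */
#define PPM_OF(h, n_ppm)          ((h) % (n_ppm))
#define PROTO_THREAD_OF(h, n_thr) (((h) >> 8) % (n_thr))
```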

  17. Rule Partition Balancing (RPB)
  • Each engine works on a sub-ruleset
  • Offline partition keeps each detection engine small
  • Packet skipping: if one engine finds an intrusion in a packet, the other engines can skip it (sketch below)
  • See our paper for details
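The packet-skipping rule can be captured with one atomic flag per packet. A sketch assuming C11 atomics, with hypothetical field and function names:

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool matched;  /* set once any engine reports an intrusion */
} msg_t;

extern bool scan_subruleset(int engine_id, const void *pkt);

void detection_engine(int engine_id, const void *pkt, msg_t *m) {
    /* Skip: another engine already found an intrusion in this packet. */
    if (atomic_load_explicit(&m->matched, memory_order_acquire))
        return;
    /* Otherwise scan only this engine's offline-partitioned sub-ruleset. */
    if (scan_subruleset(engine_id, pkt))
        atomic_store_explicit(&m->matched, true, memory_order_release);
}
```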

  18. Optimal Thread Allocation for Each PPM
  • 1.5 Mpps with 9 cores
  • 1 packet capture thread
  • 2 protocol processing threads
  • 6 detection engine threads
  (A core-pinning sketch follows)
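One way to realize the 1 + 2 + 6 split is to pin each of a PPM's nine threads to its own core. A sketch using the standard Linux affinity call (the Tilera tmc/cpus.h API offers equivalents; the core layout is an assumption):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to one core. For a PPM based at core b:
 * capture on b, protocol threads on b+1..b+2, detection engines on
 * b+3..b+8 -- nine cores per PPM, as above. */
static void pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```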

  19. Outline
  • Background & Motivation
  • Our Approach
  • Evaluation
  • Conclusion

  20. Evaluation Platform
  • TILERAGX36 processor: 36 cores at 1.2 GHz
  • Suricata (open-source NIDS) implementation
  • Snort ruleset: 7,571 rules
  • Synthetic traffic generator

  21. Throughput (9 cores per PPM, 4 PPMs)
  • 7.2 Gbps with 100-byte packets

  22. Comparison

  23. Throughput-Cost
  • 17.40 Mbps/$
  • 8 times that of MIDeA
  • 3 times that of Kargus

  24. Conclusion
  • Two parallel designs: the hybrid parallel scheme and the hybrid load balancing scheme
  • NIDS evaluation on the TILERAGX 36
  • High throughput per dollar of cost

  25. Thank you!
