Hyperion is a high-volume packet monitoring system that couples high-speed stream storage with real-time indexing for extensive archival and retrospective querying. Developed by Peter Desnoyers and Prashant Shenoy at the University of Massachusetts, it addresses key challenges in network forensics by capturing, indexing, and analyzing packet headers. Hyperion can query hundreds of millions of packet records while sustaining real-time capture without data loss, making it valuable for debugging, analyzing system compromises, and improving overall network management.
Hyperion: High Volume Stream Archival for Retrospective Querying Peter Desnoyers and Prashant Shenoy University of Massachusetts
Packet monitoring with history
• Packet monitor: capture and search packet headers
  • E.g.: Snort, tcpdump, Gigascope
• … with history:
  • Capture, index, and store packet headers
  • Interactive queries on stored data
• Provides new capabilities:
  • Network forensics: when was a system compromised? From where? How?
  • Management: after-the-fact debugging
[Diagram: packets flow from the monitor into storage]
Challenges
• Speed
  • Storage rate, capacity: store data without loss, retain it long enough
• Queries
  • Must search millions of packet records
• Indexing in real time, for online queries
• Commodity hardware
For each link monitored: 1 Gbit/s × 80% ÷ 400 B/pkt = 250,000 pkts/s
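To make the slide's arithmetic concrete, here is a minimal sketch of the per-link rate calculation; the link speed, utilization, and average packet size are the figures quoted above, not measured values:

```python
# Back-of-envelope packet rate for one monitored link (figures from the slide).
LINK_BPS = 1e9          # 1 Gbit/s link
UTILIZATION = 0.80      # assume the link is 80% utilized
AVG_PKT_BYTES = 400     # assumed average packet size

bytes_per_sec = LINK_BPS * UTILIZATION / 8   # bits -> bytes
pkts_per_sec = bytes_per_sec / AVG_PKT_BYTES
print(f"{pkts_per_sec:,.0f} pkts/s")         # -> 250,000 pkts/s
```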
Existing approaches*
Packet monitoring with history requires a new system.
*e.g., Niksun NetDetector, Sandstorm NetInterceptor
Outline of talk
• Introduction and Motivation
• Design
• Implementation
• Results
• Conclusions
Hyperion Design
• Multiple monitor systems
• High-speed storage system
• Local index
• Distributed index for query routing
[Diagram: a Hyperion node combines monitor/capture, index, and storage; nodes share a distributed index]
Storage Requirements
• Real-time: writes must keep up or data is lost
• Prioritized: reads shouldn't interfere with writes
• Aging: old data replaced by new
• Stream storage: different behavior
[Figure: access behavior, typical application vs. Hyperion; packet monitoring is different from typical applications]
Log-structured stream storage
• Goal: minimize seeks, despite interleaved writes on multiple streams
• Log-structured file system minimizes seeks
  • Interleave writes at an advancing frontier
  • Free space collected by a segment cleaner
• But: a general-purpose segment cleaner performs poorly on streams
[Diagram: segments of streams A, B, C interleaved by disk position at the write frontier]
Hyperion StreamFS
• How to improve on a general-purpose file system?
  • Rely on application use patterns
  • Eliminate unneeded features
• StreamFS: log structure with no segment cleaner
  • No deletes (just overwrite)
  • No fragmentation
  • No segment-cleaning overhead
• Operation (see the sketch below):
  • Write a fixed-size segment
  • Advance the write frontier to the next segment ready for deletion, skipping segments that are not
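A minimal sketch of the overwrite discipline described above. The class and method names are illustrative assumptions, not the real StreamFS interface, and the in-memory list stands in for raw disk segments:

```python
# StreamFS-style circular log sketch: fixed-size segments, no segment
# cleaner -- old segments are simply overwritten in place.

SEGMENT_SIZE = 1 << 20  # 1 MiB segments (assumed size)

class CircularLog:
    def __init__(self, num_segments: int):
        self.segments = [b""] * num_segments  # backing "disk"
        self.frontier = 0                     # next segment to write

    def write_segment(self, stream_id: str, data: bytes) -> int:
        assert len(data) <= SEGMENT_SIZE
        pos = self.frontier
        self.segments[pos] = data             # overwrite: no delete, no cleaning
        self.frontier = (pos + 1) % len(self.segments)  # advance write frontier
        return pos    # caller records (stream_id, pos) in its index

log = CircularLog(num_segments=8)
loc = log.write_segment("stream_A", b"packet headers...")
```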
StreamFS Design
• Record: a single write, packed into a:
• Segment: fixed-size, single stream, interleaved into a:
• Region, which contains a:
• Region map: identifies the segments in the region; used when the write frontier wraps
• Directory: locates streams on disk
[Diagram: records packed into segments, segments grouped into regions with a region map, and a directory entry per stream]
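An illustrative in-memory model of the layout hierarchy above; all field names are assumptions, and the real on-disk format certainly differs:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Segment:
    stream_id: str          # a segment holds records from a single stream
    records: List[bytes]    # records, each packed by a single write

@dataclass
class Region:
    segments: List[Segment]
    # Region map: which stream owns each segment slot; consulted when
    # the write frontier wraps around the disk.
    region_map: List[str] = field(default_factory=list)

@dataclass
class Directory:
    # Locates each stream's segments on disk: stream -> (region, slot) list.
    streams: Dict[str, List[Tuple[int, int]]] = field(default_factory=dict)
```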
StreamFS optimizations
• Data retention
  • Control how much history is saved
  • Lets the file system make delete decisions
• Speed balancing
  • Worst-case speed is set by the slowest tracks
  • Solution: interleave fast and slow sections
  • Worst-case speed is now set by the average track
[Diagram: a retention reservation; new data is written while old data past the reservation is deleted]
Index and search mechanisms
Local Index
• Requirements:
  • High insertion speed
  • Interactive query response
Signature Index
• Compress data into a signature
• Store the signature separately
• Search the signature, not the data
• Retrieve the data itself on a match
• Signature algorithm: Bloom filter (a sketch follows)
  • No false negatives: never misses a result
  • False positives: extra read overhead
[Diagram: keys extracted from records are hashed into a signature]
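A minimal Bloom-filter sketch of the signature idea. The bit-array size, hash count, and SHA-256-based hashing are illustrative assumptions, not Hyperion's actual parameters:

```python
# Minimal Bloom filter: the signature never misses a real match
# (no false negatives) but may report false positives.
import hashlib

class BloomFilter:
    def __init__(self, num_bits: int = 8192, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key: bytes):
        # Derive k bit positions by salting a cryptographic hash (assumed scheme).
        for i in range(self.num_hashes):
            h = hashlib.sha256(i.to_bytes(1, "big") + key).digest()
            yield int.from_bytes(h[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: bytes) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

sig = BloomFilter()
sig.add(b"10.0.0.1:80")                    # index a packet key
assert sig.might_contain(b"10.0.0.1:80")   # no false negatives
```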
Signature index efficiency
• Overhead = bytes searched = index size + false-positive data scans
• Concise index:
  • Index scan cost: low
  • False-positive scans: high
• Verbose index:
  • Index scan cost: high
  • False-positive scans: low
[Plot: bytes searched vs. index size]
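To illustrate the tradeoff on this slide, here is a toy cost model; every number (key count, block count, block size, hash count) is an assumption chosen only to show the shape of the curve:

```python
# Toy model: total bytes searched = index bytes scanned
#            + bytes scanned in blocks flagged as false positives.
import math

def bytes_searched(index_bits_per_key, keys, blocks, block_bytes, hashes=4):
    index_bytes = index_bits_per_key * keys / 8
    # Standard Bloom false-positive estimate for m bits, n keys, k hashes.
    m = index_bits_per_key * keys
    fp = (1 - math.exp(-hashes * keys / m)) ** hashes
    return index_bytes + fp * blocks * block_bytes

for bits in (4, 8, 16, 32, 64):  # concise -> verbose
    total = bytes_searched(bits, keys=1_000_000, blocks=10_000, block_bytes=65_536)
    print(f"{bits:3d} bits/key -> {total:12,.0f} bytes searched")
# Small indexes pay in false-positive scans, large ones in index-scan cost;
# with these assumed numbers the minimum falls in between.
```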
Multi-level signature index
• Concise index: low scan overhead
• Verbose index: low false-positive overhead
• Use both (see the sketch below):
  • Scan the concise index
  • Check positives in the verbose index
[Diagram: concise index → verbose index → data records]
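A sketch of the two-level lookup, reusing the BloomFilter class from the sketch above; per-block indexing and the filter sizes are assumptions:

```python
# Two-level lookup: a small concise filter per block is scanned first; only
# blocks it flags are re-checked against a larger verbose filter, and only
# blocks passing both are read from disk.

class BlockIndex:
    def __init__(self):
        self.concise = BloomFilter(num_bits=1024)   # cheap to scan
        self.verbose = BloomFilter(num_bits=16384)  # few false positives

    def add(self, key: bytes):
        self.concise.add(key)
        self.verbose.add(key)

def candidate_blocks(indexes, key: bytes):
    for block_no, idx in enumerate(indexes):
        if idx.concise.might_contain(key) and idx.verbose.might_contain(key):
            yield block_no  # fetch and scan only these data blocks
```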
Distributed Index
• Query routing:
  • Send queries only to nodes holding matches
  • Use the signature index
• Index distribution:
  • Aggregate indexes at a cluster head
  • Route queries through the cluster head
  • Rotate the cluster head for load sharing
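A sketch of signature-based routing at the cluster head, assuming each node's aggregated signature exposes a might_contain test as in the Bloom-filter sketch above; the structure is inferred from this slide, not Hyperion's actual protocol:

```python
# The cluster head holds one aggregated signature per node and forwards a
# query only to nodes whose signature might match the key.
def route_query(node_signatures, key: bytes):
    # node_signatures: {node_name: aggregated signature for that node}
    return [name for name, sig in node_signatures.items()
            if sig.might_contain(key)]
```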
Implementation
• Components:
  • StreamFS
  • Index
  • Capture
  • RPC, query & index distribution
  • Query API
• Linux OS
• Python framework
[Diagram: Hyperion components; query API and RPC/query/index distribution layered over index, StreamFS, and capture, on the Linux kernel]
Outline of talk
• Introduction and Motivation
• Design
• Implementation
• Results
• Conclusions
Experimental Setup
• Hardware:
  • Linux cluster
  • Dual 2.4 GHz Xeon CPUs
  • 1 GB memory
  • 4 × 10K RPM SCSI disks
  • Syskonnect SK98xx + U. Cambridge driver
• Test data:
  • Packet traces from the UMass Internet gateway*
  • 400 Mbit/s, 100K pkts/s
*http://traces.cs.umass.edu
StreamFS – write performance
• Tested configurations:
  • NetBSD / LFS
  • Linux / XFS (SGI)
  • StreamFS
• Workload:
  • Multiple streams and rates
  • Logfile rotation (used for LFS, XFS)
• Results:
  • 50% boost in worst-case throughput
  • Fast enough to store 1,000,000 packet headers/s
StreamFS – read/write
• Workload:
  • Continuous writes
  • Random reads
• StreamFS: sustained write throughput
• XFS: throughput collapse
StreamFS can handle stream read+write traffic without data loss; XFS cannot.
Index Performance
• Calculation benchmark: 250,000 pkts/s
• Query:
  • 380M packet headers, 26 GB of data
  • Selective query (1 packet returned)
• Query results: 13 MB fetched to query 26 GB of data (1:2000)
[Plot: data fetched (MB) vs. index size]
System Performance
• Workload:
  • Trace replay
  • Simultaneous queries
  • Speed: 100-200K pkts/s
• Packet loss measured: #transmitted − #received

  Packets/s   Loss rate
  110,000     0
  130,000     0
  150,000     2·10⁻⁶
  160,000     4·10⁻⁶
  175,000     10·10⁻⁶
  200,000     0.001

Up to 175K pkts/s with negligible packet loss
Conclusions
Hyperion: packet monitoring with retrospective queries
Key components:
• Storage: 50% improvement over general-purpose file systems
• Index:
  • Insert at 250K pkts/s
  • Interactive query over 100s of millions of packets
• System: capture, index, and query at 175K pkts/s
Questions?