Modeling Network Traffic as Images

Presentation Transcript

  1. Modeling Network Traffic as Images • Seong Soo Kim and A. L. Narasimha Reddy • Computer Engineering, Department of Electrical Engineering, Texas A&M University • {skim, reddy}@ee.tamu.edu

  2. Contents • Introduction and Motivation • Network Traffic as Images • Visual Representation • Requirements for Representing Network Traffic as Images • Sampling Rates • Visual Modeling of Network Traffic as Images • normal traffic, semi-random attacks, random attacks • Image Processing for Network Traffic • Validity of intra-frame DCT • Inter-frame differential coding • Conclusion • Texas A & M University ICC 2005

  4. Attack/Anomaly • Bandwidth attacks/anomalies, flash crowds • DoS (Denial of Service): UDP flooding, TCP SYN flooding, ICMP flooding • Typical types: • Single attacker (DoS) • Multiple attackers (DDoS) • Multiple victims (Worm) • Aggregate packet header data as signals • Signal/image-based anomaly/attack detectors

  5. Motivation (1) • Previous studies looked at the behavior of individual flows • Partial state • RED-PD • These become ineffective with DDoS → Aggregate • Link speeds are increasing • currently at Gb/s, soon to be at 10~100 Gb/s • Need simple, effective mechanisms that can be implemented at line speed • Look at aggregate information of traffic • Use sampling to reduce the cost of processing • Process aggregate data to detect anomalies

  6. Motivation (2) • Signature (rule)-based approaches are tailored to known attacks • Look for packets with port number 1434 (SQL Slammer) • Become ineffective when traffic patterns or attacks change • New threats are constantly emerging • Do not want to rely on attack-specific information • Most current monitoring/policing tools work off-line • Flowscan, FlowAnalyzer, AutoFocus • Quick identification of network anomalies is necessary to contain threats • Can we design generic (and generalized) mechanisms for attack detection and containment? • Measurement (network)-based real-time detection

  8. Packet Header • Packet headers carry a rich set of information • Data: packet counts, byte counts, number of flows • Domain: source/destination address, source/destination port numbers, protocol numbers • An image/video can represent each kind of data in each domain • Image processing/video analysis deciphers the patterns of traffic • single → multiple (Worm): horizontal lines • multiple → single (DDoS): vertical lines

  9. Domain size reduction (1) • Header fields may have large domain spaces • IPv4 addresses: 2^32; IPv6 addresses: 2^128 • Need to minimize storage and processing complexity for real-time processing • Employ “domain folding” • For example: a two-dimensional array count[i][j] • records the packet count for value j in the i-th field (byte) of the IP address • Effects • A 32-bit address is split into four 8-bit fields • Smaller memory: 2^32 (4G) → 4*256 (1K) • Running time: O(n) → O(lg n) • A form of hashing • Advantages • The hashing can be reversed, within limits, to identify the target IP address

  10. Data structure for reducing domain size (2) • Simple example • IP 1 = 165.91.212.255, No. of Flows = 3 • IP 2 = 64.58.179.230, No. of Flows = 2 • IP 3 = 216.239.51.100, No. of Flows = 1 • IP 4 = 211.40.179.102, No. of Flows = 10 • IP 5 = 203.255.98.2, No. of Flows = 2 • [Figure: count array with axis ticks 0, 64, 128, 192, 255, showing a count of 3 in each of the four rows]

  11. Data structure for reducing domain size (2) • Simple example • IP 1 = 165.91.212.255, No. of Flows = 3 • IP 2 = 64.58.179.230, No. of Flows = 2 • IP 3 = 216.239.51.100, No. of Flows = 1 • IP 4 = 211.40.179.102, No. of Flows = 10 • IP 5 = 203.255.98.2, No. of Flows = 2 • [Figure: count array with axis ticks 0, 64, 128, 192, 255, with all five addresses recorded; one cell holds 12]
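
The folding in this example can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `fold_address` and `build_counts` are hypothetical, and the flow counts are simply the five values listed on the slide.

```python
def fold_address(ip):
    """Split a dotted-quad IPv4 address into its four 8-bit fields."""
    fields = [int(part) for part in ip.split(".")]
    if len(fields) != 4 or any(not 0 <= f <= 255 for f in fields):
        raise ValueError("not an IPv4 address: %r" % ip)
    return fields


def build_counts(flows):
    """Accumulate flow counts into a 4 x 256 array, count[i][j]."""
    count = [[0] * 256 for _ in range(4)]
    for ip, n_flows in flows.items():
        for i, j in enumerate(fold_address(ip)):
            count[i][j] += n_flows
    return count


# The five addresses and flow counts from the slide's example.
flows = {
    "165.91.212.255": 3,
    "64.58.179.230": 2,
    "216.239.51.100": 1,
    "211.40.179.102": 10,
    "203.255.98.2": 2,
}
count = build_counts(flows)
```

IP 2 and IP 4 share the value 179 in their third field, so `count[2][179]` accumulates both flow counts. This is the collision behavior expected of a hash, and it is why the folding can only be reversed within limits.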

  12. Visual Representation

  14. Image-based analysis • Generating useful signals based on traffic images • Treat the traffic data as images • Apply image-processing-based analysis • Enables applying image/video processing to the analysis of network traffic • Some attacks become clearly visible to the human eye • Video compression techniques lead to data reduction • Scene change analysis leads to anomaly detection • Motion prediction leads to attack prediction • Pattern recognition leads to anomaly identification

  15. Impacts of Design Factors for Representing Network Traffic as Images (1) • Sampling rates • To discriminate the current traffic state on the basis of its stationarity, we should select the sampling frequency that yields the most stable images • The periodicity of traffic

  16. Impacts of Design Factors for Representing Network Traffic as Images (2) • Sampling rates • In normal times the traffic is stationary, and the choice of sampling period is not crucial • During attacks the traffic changes dynamically over time, and the sampling period is a crucial factor • 30 ~ 120 sec. sampling

  17. Flow-based Network Traffic Images • Visual representation based on the number of flows • The number of flows in the (source/destination) address domain • Black dots/lines indicate more concentrated traffic intensity • The analysis is effective for revealing flood-type attacks • The image reveals the characteristics of traffic • Normal behavior mode • A single target (DoS) • Semi-random target: a subnet is fixed and the other portion of the address changes (prefix-based attacks) • Random target: horizontal scan (Worm) and vertical scan (DDoS)

  18. Network traffic as images – normal network traffic • Standard deviation of the most significant DCT coefficients of the images • Energy distribution of the number of flows over the address domain • In the normal traffic state, this signal sits at a middle level between the two anomalous cases that follow • Legitimate flows do not form any regular shape because they are randomly distributed over the address domain

  19. Network traffic as images – semi-random targeted attacks • The difference between attackers (or victims) and legitimate users is remarkable • Higher variance than normal traffic • The specific area of the data structure is shown in a darker shade • Traffic is concentrated on a single (aggregated) destination or a subnet

  20. Network traffic as images – random targeted attacks • All of the addresses are exploited in host-scan attacks • Uniform intensity → low variance • The whole region of the image has uniform intensity • Horizontal/vertical lines indicate anomalies in the 2D image • Random (sequential, dictionary scan) attacks • Horizontal scan: from the same source aimed at multiple targets (worm propagation) • Vertical scan: from several machines (in a subnet) to a single destination (DDoS) • Worm propagation type attack • DDoS propagation type attack

  21. Summary of Visual representation of traffic data • Worm attacks – horizontal line in 2D image • DDoS attacks – vertical line in 2D image • Line detection algorithm • Visual images look different in different traffic modes • Motion prediction can lead to attack prediction
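
The slide names a line detection algorithm without describing it. As a rough sketch only, one simple stand-in is to flag any row or column whose share of non-zero pixels exceeds a threshold. The layout `image[src][dst]`, the function names and the `min_fill` threshold are all assumptions made for illustration, and the traffic below is synthetic.

```python
def build_image(flows, size=256):
    """image[src][dst] = number of flows from source byte to destination byte."""
    image = [[0] * size for _ in range(size)]
    for src, dst in flows:
        image[src][dst] += 1
    return image


def find_lines(image, min_fill=0.5):
    """Return (rows, cols) whose fraction of non-zero pixels is >= min_fill."""
    size = len(image)
    rows = [r for r in range(size)
            if sum(1 for v in image[r] if v > 0) / size >= min_fill]
    cols = [c for c in range(size)
            if sum(1 for r in range(size) if image[r][c] > 0) / size >= min_fill]
    return rows, cols


# Synthetic traffic: source byte 10 scans every destination byte (worm-like),
# and source bytes 20..219 all target destination byte 77 (DDoS-like).
worm = [(10, d) for d in range(256)]
ddos = [(s, 77) for s in range(20, 220)]
rows, cols = find_lines(build_image(worm + ddos))
```

With this layout the scanning source shows up as a full row (a horizontal line) and the shared victim as a mostly filled column (a vertical line), which matches the worm and DDoS signatures summarized above.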

  23. Generation of a useful signal: scene change analysis with DCT • We can apply various image processing techniques • From the generated images, we can derive useful signals through the DCT (Discrete Cosine Transform) • The DCT is effective for storage reduction and for approximating the energy distribution in an image • Variance of leading DCT coefficients in 8-by-8 blocks → instead of all DCT coefficients, we can keep only the dominant coefficient
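
A minimal sketch of such a signal is shown here, assuming an orthonormal 2-D DCT-II, an image whose side is a multiple of 8, and the leading (DC) coefficient of each 8-by-8 block as the dominant one. The function names are hypothetical, and a pure-Python DCT is used only to keep the example self-contained.

```python
import math
from statistics import pstdev

N = 8  # block size named on the slide

# Orthonormal DCT-II basis: C[k][n] = a(k) * cos(pi * (2n + 1) * k / (2N))
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def dct2(block):
    """2-D DCT-II of an N x N block, computed as C * block * C^T."""
    tmp = [[sum(C[k][n] * block[n][m] for n in range(N)) for m in range(N)]
           for k in range(N)]
    return [[sum(tmp[k][m] * C[l][m] for m in range(N)) for l in range(N)]
            for k in range(N)]


def leading_coefficients(image):
    """Leading (DC) DCT coefficient of every 8 x 8 block of the image."""
    size = len(image)
    coeffs = []
    for r in range(0, size, N):
        for c in range(0, size, N):
            block = [row[c:c + N] for row in image[r:r + N]]
            coeffs.append(dct2(block)[0][0])
    return coeffs


def traffic_signal(image):
    """Standard deviation of the leading coefficients: one value per frame."""
    return pstdev(leading_coefficients(image))
```

An image of uniform intensity gives the same leading coefficient in every block, so the signal is close to zero. An image with traffic concentrated in one block gives a larger value.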

  24. Impact of Selecting DCT coefficients (1) • TCG (G_T): Transform Coding Gain • TCG measures the amount of energy packed into the low-frequency (leading) coefficient • A higher TCG leads to a smaller intra-frame MSE and higher compression
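
The slide does not give a formula for G_T. The sketch below assumes the common textbook definition of transform coding gain, the ratio of the arithmetic mean to the geometric mean of the coefficient variances; whether the authors used exactly this form is not stated. The function name is hypothetical.

```python
import math


def transform_coding_gain(variances):
    """Arithmetic mean over geometric mean of positive coefficient variances."""
    if not variances or any(v <= 0 for v in variances):
        raise ValueError("variances must be a non-empty list of positive values")
    n = len(variances)
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic / geometric
```

When every coefficient carries the same variance the gain is 1. The more the energy is packed into a few coefficients, the larger the gain becomes.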

  25. Impacts of Selecting DCT coefficients (2) • Intra-frame DCT • Random traffic can be packed within fewer coefficients than semi-random traffic • Using inter-frame differential coding, we can improve G_T • For an MSE of 0.3349, the number of required coefficients drops from 42 to 3 • TCG increases 2.6 times
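
The idea behind inter-frame differential coding can be illustrated with a small sketch: when consecutive frames are nearly identical, their pixel-wise difference carries far less energy than either frame. The frames below are synthetic and the function names are hypothetical. The sketch does not reproduce the figures quoted on the slide.

```python
def frame_difference(prev, curr):
    """Pixel-wise difference between two equally sized frames."""
    return [[c - p for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def energy(frame):
    """Sum of squared pixel values."""
    return sum(v * v for row in frame for v in row)


# Synthetic, nearly stationary traffic: one pixel changes between frames.
prev = [[5] * 8 for _ in range(8)]
curr = [row[:] for row in prev]
curr[3][4] = 9
residual = frame_difference(prev, curr)
```

Here the residual has a single non-zero pixel, so a transform of the residual has much less energy to represent than a transform of the full frame.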

  26. Impacts of Design Factors for Representing Network Traffic as Images • Effect of sampling rate on DCT coefficients • A sampling period of 60 seconds maintains the minimum intra-frame MSE over the entire range of retained DCT coefficients • We can choose 30 ~ 120 sec. as an appropriate sampling period

  27. Attack Estimation (1) - Motion prediction • Step 1: complexity reduction • Pixels below a mean packet count • Normalized absolute difference similarity • Step 2: find a block of addresses

  28. Attack Estimation (2) - Motion prediction • Step 3: calculate the quantitative components • Starting position • Motion vector • Step 4: compensate for errors
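
The slides give only the outline of the four steps, so the following is a loose sketch under stated assumptions rather than the authors' method. Step 1 is taken to mean zeroing pixels below the mean count. Steps 2 and 3 are simplified to tracking the weighted centre of the remaining pixels, which stands in for the normalized-absolute-difference block search named on the slide. Step 4 (error compensation) is omitted. All function names are hypothetical.

```python
def suppress_background(image):
    """Step 1 (assumed): zero out pixels below the mean count."""
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    return [[v if v >= mean else 0 for v in row] for row in image]


def centroid(image):
    """Steps 2-3 (simplified): weighted centre (row, col) of active pixels."""
    total = sum(v for row in image for v in row)
    if total == 0:
        raise ValueError("image has no active pixels")
    r = sum(i * v for i, row in enumerate(image) for v in row) / total
    c = sum(j * v for row in image for j, v in enumerate(row)) / total
    return r, c


def motion_vector(prev, curr):
    """Displacement of the active region between two frames."""
    r0, c0 = centroid(suppress_background(prev))
    r1, c1 = centroid(suppress_background(curr))
    return r1 - r0, c1 - c0


def predict_next(prev, curr):
    """Linear extrapolation of the active region's position."""
    dr, dc = motion_vector(prev, curr)
    r1, c1 = centroid(suppress_background(curr))
    return r1 + dr, c1 + dc
```

If an attacked block of addresses moves three columns between two frames, the sketch predicts it three columns further along in the next frame. A real detector would need the error compensation of Step 4 to handle noise and multiple simultaneous targets.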

  29. Advantages • Not looking for specific known attacks • Generic mechanism • Works in real time • Latencies of a few samples • Simple enough to be implemented inline

  31. Conclusion • We studied the feasibility of analyzing packet header data through image and DCT analysis for detecting traffic anomalies • We evaluated the effectiveness of our approach by applying it to network traffic • Can rely on many tools from the signal/image processing area • More robust offline analysis is possible • Concise for logging and playback • Real-time resource accounting is feasible • Real-time traffic monitoring is feasible • Simple enough to be implemented inline

  32. Thank you!

  33. Processing and memory complexity • Two samples of packet header data: 2*P, where P is the size of the sample data • Summary information (DCT coefficients etc.) over samples: S • Total space requirement: O(P+S) • P: 2^32 → 4*256 = 1024 (1D); 2^64 → 256K (2D) • S: 32*32 → 16 • Memory requirement: 258K • Processing: O(P+S) • Update 4 counters per domain • Per-packet data-plane cost is low
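
The sizes quoted for P follow from the folding arithmetic, as the short calculation below shows. The breakdown of the 258K total is an assumption: the slide does not say how it is composed, and one 2D array plus two 1D arrays is simply one reading that reproduces the figure.

```python
FIELDS = 4      # an IPv4 address folds into four 8-bit fields
VALUES = 256    # each field takes one of 256 values

p_1d = FIELDS * VALUES             # one address domain: 4 * 256 counters
p_2d = FIELDS * VALUES * VALUES    # source x destination, field by field

# Assumption: total = one 2D array plus two 1D arrays (source, destination).
total = p_2d + 2 * p_1d

print(p_1d, p_2d // 1024, total // 1024)
```

Per packet, only the four counters of each monitored domain are touched, which is why the data-plane cost stays low regardless of the size of the folded arrays.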