Network Security Monitoring and Analysis based on Big Data Technologies

Network Security Monitoring and Analysis based on Big Data Technologies. Bingdong Li. August 26, 2013. Outline. Motivation Objectives System Design Monitoring and Visualization Network Measurement Classification and Identification of Network Objects Conclusion Future Work. Motivation.

Presentation Transcript


  1. Network Security Monitoring and Analysis based on Big Data Technologies Bingdong Li August 26, 2013

  2. Outline • Motivation • Objectives • System Design • Monitoring and Visualization • Network Measurement • Classification and Identification of Network Objects • Conclusion • Future Work

  3. Motivation • Traditional security systems assume a static system • Network attacks • sophisticated • organized • targeted • persistent • dynamic • external • internal

  4. Motivation • Problem: Network Security is becoming more challenging • Resource: A Large Amount of Security Data • Network flow • Firewall log • Application log • Server log • SNMP • Opportunity: Big Data Technologies, Machine Learning

  5. Objectives • A network security monitoring and analysis system based on Big Data technologies that provides: • Network measurement • Real-time continuous monitoring and interactive visualization • Intelligent classification and identification of network objects, using role behavior as context

  6. Objectives Network Security Big Data Machine Learning

  7. System Design • Data Collection

  8. System Design • Online Real Time Process

  9. System Design • NoSQL Storage

  10. System Design • User Interfaces

  11. System Design • The Design supports features: • Real Time Continuous Monitoring and Interactive Visualization • Network Measurement • Classification and Identification of Network Objects

  12. Monitoring and Visualization • Real Time: responds within a time constraint • Interactive: involves user interaction • Continuous: “continue to be effective over time in light of the inevitable changes that occur” (NIST)

  13. Monitoring and Visualization • Retrieve Data • Web User Interfaces • Video Demo

  14. Monitoring and Visualization • Data Retrieving: Data are stored with the IP address as the primary key and the time slice as the secondary key (the column) • Accessing these data takes Θ(1) time
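
A minimal sketch of this key layout, assuming an in-memory dictionary of dictionaries stands in for the NoSQL column store. The slides do not name the store or its API, so `FlowStore`, `put`, `get`, and the 60-second slice width are all hypothetical. The point is only that a lookup by (IP, time slice) costs two hash lookups on average, regardless of how much data is stored.

```python
from collections import defaultdict

SLICE_SECONDS = 60  # hypothetical time-slice width; the slides do not give one


def time_slice(timestamp: int) -> int:
    """Map a Unix timestamp to the start of its time slice."""
    return timestamp - (timestamp % SLICE_SECONDS)


class FlowStore:
    """Toy stand-in for a column store: row key = IP, column key = time slice."""

    def __init__(self) -> None:
        self._rows = defaultdict(dict)

    def put(self, ip: str, timestamp: int, record: dict) -> None:
        # Append the record to the (IP, time slice) cell.
        self._rows[ip].setdefault(time_slice(timestamp), []).append(record)

    def get(self, ip: str, timestamp: int) -> list:
        # Two hash lookups, independent of the number of rows or slices.
        return self._rows.get(ip, {}).get(time_slice(timestamp), [])


store = FlowStore()
store.put("10.0.0.5", 1377500405, {"dst": "10.0.0.9", "dport": 80, "bytes": 1500})
store.put("10.0.0.5", 1377500430, {"dst": "10.0.0.7", "dport": 443, "bytes": 600})

# Both records fall in the same 60-second slice, so one lookup returns both.
same_slice = store.get("10.0.0.5", 1377500410)
print(same_slice)
```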

  15. Real Time Querying

  16. Host Network Connection

  17. Network Status

  18. Top N

  19. Demo of Interactivity and Continuity Video Demo

  20. Network Measurement • A case study: Anonymity Technology Usage on a Campus Network Using sFlow • Geo-Location • Usage of Anonymity Systems

  21. Geo-location of Anonymity Usage on Campus • One Instance: Bahamas, Belarus, Belgium, Bulgaria, Cambodia, Chile, Colombia, Estonia, Ghana, Greece, Hungary, Ireland, Israel, Jamaica, Jordan, Korea, Mongolia, Namibia, Nigeria, Pakistan, Panama, Philippines, Slovakia, Turkey, Ukraine, Vietnam, Zimbabwe • Two Instances: Chad, Czech Republic, Denmark, Hong Kong, Iran, Japan, Kazakhstan, Poland, Romania, Spain, Switzerland • Three Instances: Austria, France, Singapore • Four Instances: Australia, Indonesia, Taiwan, Thailand

  22. Usage of Anonymity Systems

  23. Classification of Host Roles • Data: Three months of sFlow data from a large campus

  24. Classification of Host Roles • Algorithms • Decision Tree • On-line SVM

  25. Classification of Host Roles • Features • Ad hoc based on domain knowledge • Aggregating features for on-line classification • 24 features normalized between 0 and 1, inclusive
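
The slides do not say how the 24 features are scaled into [0, 1]. One common choice is min-max normalization, sketched below as an assumption rather than as the author's method; the function name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Example: a hypothetical per-host feature such as mean packet size in bytes.
packet_sizes = [64, 782, 1500]
normalized = min_max_normalize(packet_sizes)
print(normalized)
```

For on-line classification the minimum and maximum would have to be tracked incrementally or fixed in advance (for example, 0 to 65535 for port numbers), since the full data set is not available up front.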

  26. Classification of Host Roles • Features: 24 features derived from • src/dest IP address • src/dest Port number • TTL • Packet Size • Transport protocol

  27. Classification of Host Roles • Ground Truth • Host Information in Active Directory • Crawler to validate its status

  28. Classification of Host Roles • Classifying Client vs. Server • Classifying Web Server vs. Web Email Server • Classifying Hosts at Personal Office vs. Public Place • Classifying Hosts at Two Different Colleges • Feature Contributions

  29. Classifying Client vs. Server

  30. Classifying Web Server vs. Web Email Server

  31. Classifying Host From Personal Office vs. Public Place

  32. Classifying Host From Two Different Colleges

  33. Accuracy • High accuracies of Host Role Classification

  34. Feature Contribution

  35. Identification of a User Data: NetFlow data from a large campus

  36. Identification of a User • Algorithms • Decision Tree • On-line SVM • Ground Truth • Host Information in Active Directory • Crawler to validate its status

  37. Identification of a User • Features: discrete probability distribution function (pdf) • An Example: System Port Number [6, 8, 9, 11, 14, 30, 80, 1020] • Outlier fraction (P) is 1% • 80 is the port of interest (S) • Number of bins (R) is 4

  38. Identification of a User • An Example: (1 - 0.01) × 8 = 7.92, rounded down to 7; the 7th value is 80; bin slice size = 80 / (4 - 1) ≈ 26.7 • Values: [6, 8, 9, 11, 14, 30, 80, 1020] • pdf = [0.625, 0.125, 0.125, 0.125] • Bins: {6, 8, 9, 11, 14}, {30}, {80}, {1020}

  39. Identification of a User • An Example without P and S: bin slice size is 1024 / 4 = 256 • Values: [6, 8, 9, 11, 14, 30, 80, 1020] • pdf = [0.875, 0, 0, 0.125] • Bins: {6, 8, 9, 11, 14, 30, 80}, {}, {}, {1020}
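
The two port-number examples above can be reproduced with a short sketch. It assumes that the first R - 1 bins split the range from 0 to the cutoff evenly and that the last bin collects the outliers above the cutoff. The sketch derives the cutoff from P alone, because the slides do not spell out how S enters the computation. The function name `binned_pdf` and its parameters are hypothetical.

```python
import math


def binned_pdf(values, num_bins, upper, overflow_bin=False):
    """Discrete pdf of `values` over `num_bins` bins.

    With overflow_bin=True, the first num_bins - 1 bins split [0, upper]
    evenly and the last bin collects everything above `upper`.
    Otherwise all num_bins bins split [0, upper] evenly.
    """
    regular = num_bins - 1 if overflow_bin else num_bins
    width = upper / regular
    counts = [0] * num_bins
    for v in values:
        if overflow_bin and v > upper:
            index = num_bins - 1
        else:
            # A value equal to `upper` belongs to the last regular bin.
            index = min(int(v // width), regular - 1)
        counts[index] += 1
    return [c / len(values) for c in counts]


ports = [6, 8, 9, 11, 14, 30, 80, 1020]
P, R = 0.01, 4

# With the outlier fraction P: keep the lowest 99% of the sorted values.
cutoff_rank = math.floor((1 - P) * len(ports))  # floor(7.92) = 7
cutoff = sorted(ports)[cutoff_rank - 1]         # the 7th value, 80
pdf_with_cutoff = binned_pdf(ports, R, cutoff, overflow_bin=True)
print(pdf_with_cutoff)  # [0.625, 0.125, 0.125, 0.125]

# Without P and S: split the whole system port range 0-1024 into R equal bins.
pdf_plain = binned_pdf(ports, R, 1024)
print(pdf_plain)  # [0.875, 0.0, 0.0, 0.125]
```

With the cutoff, the common low ports are spread over three bins instead of collapsing into one, so the resulting pdf distinguishes users better than the evenly split version.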

  40. Identify a User Among Other Users

  41. Accuracy • Identifying a particular user among other users • Decision Tree: 93.3% • On-line Support Vector Machine: 78.5%

  42. Feature Contribution

  43. Conclusion • Major Contributions • A Big Data analysis system • a conference paper • Monitoring and interactive visualization • Usage of anonymity technologies • a conference and a journal paper • Models of classification of host roles and identification of users • a conference paper

  44. Conclusion • The Big Data analysis system is high-performance and scalable • Real Time Continuous Network Monitoring and Interactive Visualization are implemented and supported by the high-performance system

  45. Conclusion • Proxies and Tor are the main anonymity technologies used on campus • US, Germany, and China are the top 3 countries • Models and Features for Classification of Host Roles: • client vs. server, non-web server vs. web server, personal office vs. public place, from two different colleges • Models and Features for Identification of a particular user among other users

  46. Future Work • Improvement to the Current Work • More interactive features and better user interfaces • Further analysis on user identification: features, algorithm (such as deep learning)

  47. Future Work • Extension to the Current Work • Define and filter out background traffic • Detection of operating system fingerprinting • Identity anonymity • Fusion with other network security data source
