
Scott Miao, Trend Micro scott_miao@trend.com.tw @takeshi.miao

Threat Connect: a visualized cyber-threats entity reporting system backed with the Hadoop ecosystem. Scott Miao, Trend Micro, scott_miao@trend.com.tw, @takeshi.miao. Who am I: RD, SPN, Trend Micro; 3+ years with the Hadoop ecosystem; expertise in HDFS/MR/HBase.


Presentation Transcript


  1. Threat Connect: a visualized cyber-threats entity reporting system backed with the Hadoop ecosystem • Scott Miao, Trend Micro • scott_miao@trend.com.tw • @takeshi.miao

  2. Who am I • RD, SPN, Trend Micro • 3+ years with the Hadoop ecosystem • Expertise in HDFS/MR/HBase • @takeshi.miao

  3. Agenda • Threat intelligence problem • Challenges and Solutions • Summary

  4. Threat intelligence problem • “I want to quickly get an overview of the incident, including its scope, timeline, and impact.”

  5. Threat Connect • A Web Service for Threat Information Report • RESTful Interface to access • Integrated with TM Deep Discovery products • Relevant and Actionable Intelligence

  6. Processes and correlates different data sources • Product 1, Product 2, Product 3 • IP, domain, URL, filename, process, file hash, virus detection, registry key, etc. • Most relevant threat report with actionable intelligence on a single portal

  7. Challenges and Solutions

  8. Big Data • Moving • Storing • Process & Correlate • Real Time Access • Graph Problem • Pick your right tool

  9. Moving

  10. Accumulate small files • Dear users/services • Event Logs • FBS (Feed Back log Service) • Hadoop
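
A minimal sketch of the accumulation idea on this slide, assuming the feedback service buffers incoming event logs and hands one larger blob to storage once a size threshold is reached. The class name `BatchAccumulator`, the `flush_bytes` threshold, and the `sink` callable are all hypothetical; the slides do not describe how FBS is implemented.

```python
import io


class BatchAccumulator:
    """Buffers small records and emits one larger blob per flush (hypothetical helper)."""

    def __init__(self, flush_bytes, sink):
        self.flush_bytes = flush_bytes  # assumed size threshold that triggers a flush
        self.sink = sink                # callable that receives the batched bytes
        self._buf = io.BytesIO()

    def append(self, record: bytes) -> None:
        # Store one record per line, then flush once the buffer is large enough.
        self._buf.write(record + b"\n")
        if self._buf.tell() >= self.flush_bytes:
            self.flush()

    def flush(self) -> None:
        data = self._buf.getvalue()
        if data:
            self.sink(data)
        self._buf = io.BytesIO()


# Usage: ten tiny events become two blobs instead of ten small files.
batches = []
acc = BatchAccumulator(flush_bytes=64, sink=batches.append)
for i in range(10):
    acc.append(b"event-%02d" % i)
acc.flush()  # push whatever is left at shutdown
```

In a real deployment the threshold would more likely be chosen relative to the HDFS block size, and the sink would write a file to HDFS rather than append to a list.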

  11. Storing

  12. Process & Correlate

  13. Time • Batch • Performance • Store • Pig/MR • HDFS • HBase • Solr • RDB • UDFs • MRs for special cases

  14. Real Time Access

  15. Real Time Access • Free form search: Solr Cloud (EX. Sandbox Reports) • Random Access: HBase (EX. Threat Detection DBs)

  16. Graph Model

  17. Active community? • Massively scalable? • Analyzable?

  18. We use HBase as a Graph Storage • Google BigTable and PageRank • HBaseCon2012
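
One way to picture a graph stored in a row-oriented key-value store is an adjacency list per row: the row key is a vertex ID and each column qualifier names an edge. The sketch below emulates that layout with plain Python dictionaries. It is an illustration only, not the HGraph schema, and the `out:`/`in:` qualifier prefixes, the `KVGraph` class, and the sample entity IDs are assumptions.

```python
from collections import defaultdict, deque


class KVGraph:
    """Adjacency lists in a row -> {qualifier -> value} map, mimicking a wide-column table."""

    def __init__(self):
        # row key (vertex ID) -> {column qualifier -> edge label}
        self.rows = defaultdict(dict)

    def add_edge(self, src, dst, label):
        # Store the edge on both endpoints so either direction is a single row read.
        self.rows[src]["out:" + dst] = label
        self.rows[dst]["in:" + src] = label

    def neighbors(self, vertex):
        row = self.rows.get(vertex, {})
        return sorted(q[len("out:"):] for q in row if q.startswith("out:"))

    def reachable(self, start, max_hops):
        # Breadth-first expansion, one row read per visited vertex.
        seen = {start}
        frontier = deque([(start, 0)])
        while frontier:
            vertex, hops = frontier.popleft()
            if hops == max_hops:
                continue
            for nxt in self.neighbors(vertex):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, hops + 1))
        return seen - {start}


# Usage with made-up threat entities (domain -> IP -> file hash).
g = KVGraph()
g.add_edge("domain:evil.example", "ip:203.0.113.5", "resolves_to")
g.add_edge("ip:203.0.113.5", "hash:abc123", "served")
one_hop = g.reachable("domain:evil.example", 1)
two_hops = g.reachable("domain:evil.example", 2)
```

Keeping every outgoing edge of a vertex in one row means a neighbor lookup is a single random read, which matches the random-access use case named for HBase earlier in the talk.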

  19. HGraph https://github.com/tinkerpop/blueprints/wiki

  20. Pick right tool

  21. Pick the right tool for the right use cases • Silver bullet? • No one project fits all • One problem may have several choices http://www.neevtech.com/blog/2013/03/18/hadoop-ecosystem-at-a-glance/

  22. Summary

  23. Small files • NameNode fsimage would blow up the memory • Too many map tasks to run for a job
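
A rough back-of-the-envelope sketch of both bullets, under stated assumptions: about 150 bytes of NameNode heap per file or block object (a commonly quoted rule of thumb, not a figure from this talk), a 128 MiB block size, and one map task per block with no split combining. The function name `estimate` is hypothetical.

```python
BYTES_PER_OBJECT = 150           # assumed rule of thumb for NameNode heap per file/block object
BLOCK_SIZE = 128 * 1024 * 1024   # assumed HDFS block size (128 MiB)


def ceil_div(a, b):
    """Integer ceiling division without floating point."""
    return -(-a // b)


def estimate(total_bytes, file_size):
    """Return (files, namenode_heap_bytes, map_tasks) for data stored as equal-sized files."""
    files = ceil_div(total_bytes, file_size)
    blocks_per_file = max(1, ceil_div(file_size, BLOCK_SIZE))
    blocks = files * blocks_per_file
    heap = (files + blocks) * BYTES_PER_OBJECT
    map_tasks = blocks  # assumes one map task per block, no combined input splits
    return files, heap, map_tasks


TIB = 1024 ** 4
small = estimate(TIB, 1024 ** 2)   # 1 TiB stored as 1 MiB files
large = estimate(TIB, 1024 ** 3)   # 1 TiB stored as 1 GiB files
print(small)  # (1048576, 314572800, 1048576)
print(large)  # (1024, 1382400, 8192)
```

Under these assumptions the same terabyte costs roughly 300 MB of NameNode heap and about a million map tasks as 1 MiB files, versus about 1.4 MB and 8192 map tasks as 1 GiB files.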

  24. Store your data anyway • Store all the raw data on HDFS • Break the invisible isolation between different data sources • Archive your data in a deduced, easy-to-use file format • Trevni, RC file, ORC file

  25. Know MR more • Even if you are a Pig developer • Deal with MR issues • Write better Pig Latin • Sometimes you can only use MR

  26. Know your data & use cases • Real time? Batch? • Access pattern? • Then you can pick the right tool

  27. Thank you guys
