This study presents AutoRE, a framework for generating high-quality regular expression signatures to identify botnet-based spam emails and members. It analyzes spamming botnet characteristics and trends, including frequent domain modifications and polymorphic URLs. The AutoRE framework consists of modules like URL preprocessor, group selector, and RegEx generator to capture botnet host IP addresses and spam URLs. Through a detailed analysis using sampled emails, AutoRE successfully identified thousands of spam campaigns and distinct botnet host IP addresses, spanning multiple ASes. This pioneering work showcases the effectiveness of automatically generated regular expression signatures for detecting botnet spam activity.
Spamming Botnets: Signatures and Characteristics • Authors: Yinglian Xie, Fang Yu, Kannan Achan, Rina Panigrahy, Geoff Hulten+, Ivan Osipkov+ • Presenter: Chia-Li Lin
References • Y. Xie, F. Yu, K. Achan, R. Panigrahy, G. Hulten, and I. Osipkov. Spamming botnets: Signatures and characteristics. In SIGCOMM, 2008
Outline • Introduction • Spam Activity Trends • AutoRE Structure • Study Results • Conclusion
Introduction • Developed a spam signature generation framework called AutoRE • Detects botnet-based spam emails and botnet membership • Outputs high-quality regular expression signatures
Contribution • Ability to detect frequent domain modifications • In-depth analysis of identified spamming botnet characteristics and their activity trends
Two Observations • First, spammers often add random, legitimate URLs to the email content • These URLs are legitimate and very general (e.g., http://www.w3.org) • Second, spammers customize their URLs so that they are polymorphic
AutoRE • Automatically generates URL signatures to identify botnet-based spam campaigns • Produces two outputs: • a set of spam URL signatures, each either • a complete URL string (CU) or • a URL regular expression (RE) • a related list of botnet host IP addresses
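The two outputs can be pictured as one small record per signature. The following is a minimal sketch, not AutoRE's actual data model: the class name `UrlSignature`, the helper `record_sender`, the example domain, and the IP address are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class UrlSignature:
    """One spam URL signature plus the botnet hosts seen sending it (hypothetical layout)."""
    pattern: str                 # a complete URL (CU) or a regular expression (RE)
    is_regex: bool               # True for RE signatures, False for CU signatures
    botnet_ips: set = field(default_factory=set)

    def matches(self, url: str) -> bool:
        # RE signatures must match the whole URL; CU signatures are exact strings.
        if self.is_regex:
            return re.fullmatch(self.pattern, url) is not None
        return url == self.pattern


def record_sender(sig: UrlSignature, url: str, sender_ip: str) -> bool:
    """Add sender_ip to the signature's host list if the URL matches."""
    if sig.matches(url):
        sig.botnet_ips.add(sender_ip)
        return True
    return False
```

A signature built this way would accumulate the "related list of botnet host IP addresses" as matching emails are processed.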
Three Modules • AutoRE consists of the following three modules • URL preprocessor • Group selector • RegEx generator, which produces signatures that are • domain-specific • domain-agnostic
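The first two modules can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the preprocessor groups URLs by web domain and that the group selector simply prefers the group with the most distinct sender IPs. The function names, the URL-matching pattern, and the sample data are hypothetical, and the RegEx generator stage is omitted.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Rough URL matcher for the sketch; real email parsing is more involved.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def preprocess(emails):
    """URL preprocessor (sketch): extract URLs and group them by domain.

    emails: iterable of (sender_ip, body) pairs.
    Returns a dict mapping domain -> list of (url, sender_ip).
    """
    groups = defaultdict(list)
    for sender_ip, body in emails:
        for url in URL_RE.findall(body):
            domain = urlsplit(url).hostname
            if domain:
                groups[domain].append((url, sender_ip))
    return groups


def select_group(groups):
    """Group selector (sketch): pick the domain with the most distinct senders.

    Assumption: sender count stands in for whatever criterion AutoRE uses.
    """
    return max(groups.items(), key=lambda kv: len({ip for _, ip in kv[1]}))
```

With a selection rule like this, a very general URL such as http://www.w3.org that spammers add to some emails is outranked by the campaign's own domain whenever that domain appears in more senders' emails.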
Detailing and Generalization • Detailing • takes a keyword-based signature as input and returns a domain-specific regular expression • Generalization • returns a more general, domain-agnostic regular expression by merging very similar domain-specific regular expressions
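A toy version of the two steps is sketched below. It is a simplification under several assumptions: the keyword-based signature is approximated by the longest common prefix of the URL paths, the variable part is summarized as one character class with a length range, and "very similar" means the same literal prefix and character class. The names `DomainSig`, `detail`, and `generalize` and the example domains are hypothetical.

```python
import os
import re
from collections import defaultdict
from typing import NamedTuple


class DomainSig(NamedTuple):
    """A domain-specific signature: literal prefix plus a variable tail."""
    domain: str
    path_prefix: str
    cls: str
    lo: int
    hi: int

    def regex(self) -> str:
        return ("http://" + re.escape(self.domain) + re.escape(self.path_prefix)
                + f"{self.cls}{{{self.lo},{self.hi}}}")


def char_class(tails):
    """Pick the narrowest character class covering every variable tail."""
    joined = "".join(tails)
    for cls in ("[0-9]", "[a-zA-Z]", "[a-zA-Z0-9]"):
        if re.fullmatch(cls + "*", joined):
            return cls
    return "."  # fallback: any character


def detail(domain, urls):
    """Detailing (sketch): build a domain-specific signature from sample URLs."""
    base = "http://" + domain
    paths = [u[len(base):] for u in urls if u.startswith(base)]
    if not paths:
        raise ValueError("no URLs for this domain")
    prefix = os.path.commonprefix(paths)      # stands in for the keyword
    tails = [p[len(prefix):] for p in paths]
    lengths = [len(t) for t in tails]
    return DomainSig(domain, prefix, char_class(tails), min(lengths), max(lengths))


def generalize(sigs):
    """Generalization (sketch): merge similar signatures across domains."""
    buckets = defaultdict(list)
    for s in sigs:
        buckets[(s.path_prefix, s.cls)].append(s)
    merged = []
    for (prefix, cls), group in buckets.items():
        if len({s.domain for s in group}) > 1:   # seen on several domains
            lo = min(s.lo for s in group)
            hi = max(s.hi for s in group)
            merged.append("http://[^/]+" + re.escape(prefix) + f"{cls}{{{lo},{hi}}}")
    return merged
```

In this sketch the domain-agnostic expression replaces the domain with `[^/]+` and widens the length range to cover every merged signature, so it would also match the same URL shape on a domain that was not in the input.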
Detection Results • Using three months of sampled emails from Hotmail • November 2006, June 2007, July 2007 • AutoRE successfully detected • 7,721 spam campaigns • 340,050 distinct botnet host IP addresses • spanning 5,916 ASes
Conclusions • This is the first successful attempt to automatically generate regular expression signatures • The results demonstrate the existence of botnet spam signatures and the feasibility of detecting botnet hosts using them