
All Your iFRAMEs Point to Us

Niels Provos and Panayiotis Mavrommatis (Google Inc.); Moheeb Abu Rajab and Fabian Monrose (Johns Hopkins University). 17th USENIX Security Symposium, August 2008. Speaker: Yi-Ning Chen.





Presentation Transcript


  1. All Your iFRAMEs Point to Us Niels Provos, Panayiotis Mavrommatis Google Inc. Moheeb Abu Rajab, Fabian Monrose Johns Hopkins University 17th USENIX Security Symposium, August 2008 Speaker: Yi-Ning Chen

  2. Outline • Background • Infrastructure and Methodology • Result analysis • Prevalence of Drive-by Download • Malicious content injection • Drive-by download via ads • Malware distribution infrastructure • Post infection impact • Conclusion

  3. background

  4. Web-based attack types • Push-based • Traditional scanning and exploitation attacks • Can be blocked by firewalls, NATs, etc. • Pull-based • Web-based malware infection • Social engineering techniques • Drive-by downloads • This paper focuses on drive-by download attacks.

  5. Drive-by download • Attackers inject content under their control into benign websites. • When a user visits the website, the browser automatically downloads spyware, a virus, or other malware without the user's knowledge. • Landing pages (malicious URLs): URLs that initiate a drive-by download when users visit them. • Landing sites: landing pages grouped by domain name. • Distribution site: a remote site that hosts malicious payloads.
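The grouping of landing pages into landing sites can be sketched in a few lines. This is only an illustration: the slide says pages are grouped "by domain name", and the sketch assumes the URL's host name is an adequate stand-in for that. The function name `group_landing_sites` and the example URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_landing_sites(landing_pages):
    """Group landing-page URLs by host name (assumed stand-in for 'domain name')."""
    sites = defaultdict(list)
    for url in landing_pages:
        # urlsplit(...).hostname is already lower-cased; fall back to "" if absent.
        host = urlsplit(url).hostname or ""
        sites[host].append(url)
    return dict(sites)


# Hypothetical landing pages: two on one host, one on another.
pages = [
    "http://example.com/a.html",
    "http://example.com/forum/post?id=7",
    "http://blog.example.org/index.php",
]
landing_sites = group_landing_sites(pages)
```

With this grouping, counts reported per landing site and per landing page can differ substantially, since one site may host many infected pages.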

  6. Drive-by download – Content injection • Web server compromise • Exploit the web server via a vulnerable scripting application and inject new content into the compromised website. • Injected content: usually a hidden IFRAME containing a link that redirects the visitor to a URL hosting a script crafted to exploit the browser. • Third-party contributed content • An attacker can inject the exploit URL through a posting function without compromising the web server.

  7. Drive-by download – Exploit • The user visits a web site, which triggers automatic execution of the exploit code. • The exploit instructs the browser to connect to a malware distribution site and fetch the malware executable(s). • The executable is automatically installed and started.

  8. Drive-by download – Evading detection • Use randomly seeded, obfuscated JavaScript in the exploit code. • Use many redirection steps before the browser eventually contacts the malware distribution site.

  9. Infrastructure and methodology

  10. Methodology • Pre-processing phase • Input data: Google’s web repository • Goal: identify candidate URLs likely to trigger drive-by downloads • Verification phase • Input data: candidate URLs from the pre-processing phase • Goal: determine whether a candidate URL is malicious

  11. Pre-processing phase • Scoring features • “Out of place” IFRAMEs • Obfuscated JavaScript • IFRAMEs to known distribution sites • Translate these features into a likelihood score. • Employ five-fold cross-validation to measure the quality of the machine-learning framework. • Use the average ROC curve to estimate the FPR and TPR for different thresholds. • FPR: 0.001 • TPR: 0.9
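To make the threshold trade-off concrete, the sketch below computes the true-positive and false-positive rates obtained when every URL scoring at or above a threshold is passed on as a candidate. It is not the classifier from the paper: the scores, labels, and the function name `rates_at_threshold` are made up for illustration.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR) when URLs with score >= threshold are flagged as candidates."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return tpr, fpr


# Hypothetical likelihood scores and ground-truth labels (True = malicious).
scores = [0.95, 0.80, 0.40, 0.30, 0.10, 0.05]
labels = [True, True, True, False, False, False]

tpr, fpr = rates_at_threshold(scores, labels, threshold=0.5)
tpr_low, fpr_low = rates_at_threshold(scores, labels, threshold=0.2)
```

Lowering the threshold from 0.5 to 0.2 catches the third malicious URL but also admits one benign URL, which is the trade-off an ROC curve traces out across all thresholds.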

  12. Verification phase (1/2) • Develop a large-scale web honeynet that simultaneously runs a large number of Microsoft Windows images. • To inspect a candidate URL, the system 1. first loads a clean Windows image, 2. automatically starts an unpatched IE, 3. runs the virtual machine for two minutes.

  13. Verification phase (2/2) • Heuristics score candidate URLs based on • # of created processes, # of observed registry changes, # of file system changes • A URL is • Malicious if it meets the threshold and at least one incoming HTTP response is marked as malicious by at least one AV scanner. • Suspicious if it meets the threshold but no AV scanner flags any response.
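The labeling rule on this slide can be written as a small decision function. The sketch makes two assumptions that the slides do not state: the heuristic score is taken to be a plain sum of the three counts, and URLs below the threshold are given the label "not flagged". The names `heuristic_score` and `classify_url` are hypothetical.

```python
def heuristic_score(new_processes, registry_changes, file_changes):
    """Assumed scoring: a plain sum of the three observed counts."""
    return new_processes + registry_changes + file_changes


def classify_url(score, threshold, av_flags):
    """Label a candidate URL.

    av_flags holds one boolean per incoming HTTP response: True if at least
    one AV scanner marked that response as malicious.
    """
    if score < threshold:
        return "not flagged"  # assumed label; the slides do not name this case
    return "malicious" if any(av_flags) else "suspicious"


score = heuristic_score(new_processes=2, registry_changes=5, file_changes=3)
label_with_av_hit = classify_url(score, threshold=5, av_flags=[False, True])
label_without_av_hit = classify_url(score, threshold=5, av_flags=[False, False])
label_quiet = classify_url(1, threshold=5, av_flags=[True])
```

Note that an AV hit alone is not enough under this rule: a URL whose visit causes little system activity stays unflagged even if a scanner marks one of its responses.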

  14. Constructing the Malware Distribution Network • A distribution network is defined as the set of malware delivery trees from all the landing sites that lead to a particular distribution site. • A malware delivery tree consists of the landing site (leaf) and all nodes the browser visits until it contacts the malware distribution site (root). • Extract the Referer header from the recorded HTTP requests the browser makes after visiting the landing site to construct the delivery tree.
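One branch of a delivery tree can be recovered by following Referer values backwards from the distribution site. The sketch below assumes each recorded request is available as a `(url, referer)` pair and that every hop carries a usable Referer; the function name `delivery_path` and all URLs are hypothetical.

```python
def delivery_path(requests, distribution_url):
    """Follow Referer links backwards from the distribution URL to the landing page."""
    referer_of = {url: referer for url, referer in requests}
    path = [distribution_url]
    seen = {distribution_url}
    while True:
        parent = referer_of.get(path[-1])
        # Stop at the landing page (no Referer) or if a redirect loop is detected.
        if parent is None or parent in seen:
            break
        path.append(parent)
        seen.add(parent)
    return list(reversed(path))  # landing page first, distribution site last


# Hypothetical request log: (requested URL, Referer header value).
requests = [
    ("http://landing.example/index.html", None),
    ("http://hop1.example/r.js", "http://landing.example/index.html"),
    ("http://hop2.example/go", "http://hop1.example/r.js"),
    ("http://dist.example/payload.exe", "http://hop2.example/go"),
]
path = delivery_path(requests, "http://dist.example/payload.exe")
```

Merging such paths from all landing sites that end at the same distribution site gives the distribution network described on the slide.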

  15. PREVALENCE OF DRIVE-BY DOWNLOAD

  16. Prevalence of Drive-by Download • Based on data collected from Google over Jan 2007 – Oct 2007

  17. Malicious URL in Google search (1/2) • Percentage of Google search queries that resulted in at least one URL labeled as malicious: 1.3%

  18. Malicious URL in Google search (2/2) • Among the top one million URLs appearing in search engine results, 6,000 are verified as malicious. • Top Rank of landing page → 1,588 • Geographic locality -- Top 5 hosting countries • In China, 96% of the landing sites point to distribution sites that are also hosted in China.

  19. Impact of browsing habits • Random sample of about 7.2 million URLs • Use DMOZ to categorize URLs → 3.6 million URLs • Malicious websites are present in all website categories.

  20. Web server software • Collect all the “Server” and “X-Powered-By” header tokens from landing pages. • The results reflect the weak security practices applied by web site administrators. • Running unpatched software increases the risk of an attacker gaining control via server exploitation.

  21. DRIVE-BY DOWNLOAD VIA ADS

  22. Drive-by download via Ads (1/2) • Even if the web page itself does not contain any exploits, insecure Ad content poses a risk to advertising web sites. • Adversaries can inject content into websites without having to compromise any web server. • For each malware delivery tree, if any intermediary node belongs to one of the 2,000 well-known advertising networks, the landing site is considered infectious via Ads.
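The "infectious via Ads" test amounts to a membership check over the intermediary nodes of a delivery path. The sketch below assumes a path is a list of URLs ordered from landing page to distribution site and that the advertising networks are identified by host name; the function name `delivered_via_ads` and every host shown are hypothetical.

```python
from urllib.parse import urlsplit


def delivered_via_ads(delivery_path, ad_network_hosts):
    """True if any intermediary node (neither first nor last) is a known ad-network host."""
    intermediaries = delivery_path[1:-1]
    return any(urlsplit(url).hostname in ad_network_hosts for url in intermediaries)


# Hypothetical list standing in for the 2,000 well-known advertising networks.
ad_hosts = {"ads.adnet.example", "serve.otherads.example"}

path_with_ad = [
    "http://news.example/story.html",
    "http://ads.adnet.example/banner.js",
    "http://redirect.example/go",
    "http://dist.example/payload.exe",
]
path_without_ad = [
    "http://forum.example/thread?id=3",
    "http://redirect.example/go",
    "http://dist.example/payload.exe",
]

via_ads = delivered_via_ads(path_with_ad, ad_hosts)
not_via_ads = delivered_via_ads(path_without_ad, ad_hosts)
```

Excluding the first and last nodes reflects the slide's wording: only intermediaries count, so a landing page or distribution site that happens to sit on an ad host would not trigger the label here.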

  23. Drive-by download via Ads (2/2) • 2% of the unique landing sites were delivering malware via ads. • Counted by the number of ad appearances, however, the percentage is 12%. • Quick and short effect

  24. Redirection steps for Ads • CDF of the # of redirection steps for Ads that successfully delivered malware. • Malware delivered via Ads exhibits longer delivery chains in 50% of all cases.

  25. Ad network’s position in delivery tree • Choose the 5 Ad networks that appear in 75% of all malware delivery trees. • The deeper a network’s relative position, the closer it is to the malware distribution site.

  26. Malware distribution infrastructure

  27. Size of malware distribution network • Two main types of malware distribution networks • Networks that use only one landing site → 45% • Networks that have multiple landing sites → 21,000 landing sites • Use only a single landing site to avoid detection

  28. IP distribution of malware distribution server • 50% of the landing sites fell in the above ranges.

  29. AS location of Malware distribution sites • Malware distribution sites’ IP addresses fall into only 500/2,517 ASes. • 95% of these sites map to only 210 ASes.

  30. Unique binaries downloaded • 42% of the distribution sites delivered a single malware binary.

  31. Overlapping landing sites (1/2) • Many landing sites are shared among multiple distribution networks. • Assume a distribution network i with a set of landing sites Xi. • The normalized pair-wise intersection of two networks i and j is denoted Ci,j. • 80% of the distribution networks share at least one landing page.
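The formula for Ci,j is not preserved in this transcript. One plausible reading of "normalized pair-wise intersection", assumed here and not confirmed by the slides, is the size of the overlap divided by the size of network i's own landing-site set, Ci,j = |Xi ∩ Xj| / |Xi|. The sketch below computes that quantity; the function name and the site labels are hypothetical.

```python
def normalized_intersection(x_i, x_j):
    """Assumed definition: |X_i ∩ X_j| / |X_i| (normalized by network i's own size)."""
    x_i, x_j = set(x_i), set(x_j)
    if not x_i:
        return 0.0  # avoid division by zero for an empty network
    return len(x_i & x_j) / len(x_i)


# Hypothetical landing-site sets for two distribution networks.
network_a = {"site1", "site2", "site3", "site4"}
network_b = {"site3", "site4", "site5"}

c_ab = normalized_intersection(network_a, network_b)
c_ba = normalized_intersection(network_b, network_a)
```

Under this assumed normalization the measure is asymmetric: a small network fully contained in a large one scores 1.0 against it, while the large network scores much lower in the other direction.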

  32. Overlapping landing sites (2/2)

  33. Content replication across malware distribution sites • Using the normalized pair-wise intersection function mentioned. • In 25% of the malware distribution sites, at least one binary is shared between a pair of sites.

  34. POST INFECTION IMPACT

  35. Download executables • The average # of downloaded executables after visiting a malicious URL is 8.

  36. Processes started by executables

  37. Registry changes • Browser Helper Object: access privileged state of the browser. • Preferences: change home page, default search engine or name server. • Security: change firewall settings or disable automatic software updates. • Startup: persist across reboots.

  38. Network activity • HTTP connections originating from the browser are omitted. • HTTP: due to “downloader” binaries that fetch, in some cases, up to 60 binaries over HTTP. • IRC: the compromised machine is added to an IRC botnet.

  39. Anti-Virus Engine Detection Rates (1/2) • Visiting the URL caused the creation of at least one new process on the VM → suspicious • Subject each binary to each of the AV scanners. • Detection Rate = (# of detected samples) / (total # of suspicious samples)
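The detection rate for a single AV engine is a simple ratio, sketched below. The input format (one boolean per suspicious sample) and the function name `detection_rate` are assumptions made for the example; the sample outcomes are invented.

```python
def detection_rate(detected_flags):
    """Detection rate = (# of detected samples) / (total # of suspicious samples).

    detected_flags holds one boolean per suspicious sample: True if this
    AV engine detected the sample.
    """
    if not detected_flags:
        raise ValueError("at least one suspicious sample is required")
    return sum(detected_flags) / len(detected_flags)


# Hypothetical scan outcome for one engine over four suspicious binaries.
rate = detection_rate([True, False, True, True])
```

Computing this ratio separately for each engine allows the engines to be compared on the same set of suspicious binaries.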

  40. Anti-Virus Engine Detection Rates (2/2)

  41. False Positives • Assume all suspicious binaries will eventually be discovered by the AV vendors. • Re-scan all undetected binaries two months later using the latest virus definitions. • All binaries still undetected after the re-scan are considered false positives. (FPR < 10%) • Use a white-list to exclude popular installers exhibiting behavior similar to that of drive-by downloads.

  42. Conclusion • Malicious URLs that initiate drive-by downloads are spread far and wide. • 1.3% of search queries to Google’s search engine return at least one link to a malicious site. • Syndication relationships in Ad networks are being abused to deliver malware through Ads. • Anti-virus engines are lacking in their ability to protect against drive-by downloads.

  43. Comment • Accuracy? • Verification phase: using anti-virus engines to verify • Using final result to judge anti-virus engines
