
Cloak and Dagger

Presentation Transcript


  1. Cloak and Dagger

  2. In a nutshell… • Cloaking • Cloaking in search engines • Search engines’ response to cloaking • Lifetime of cloaked search results • Cloaked pages in search results

  3. Ubiquity of advertising on the Internet. • Search, by and large, enjoys primacy. • Search Engine Optimisation (SEO) – the doctoring of search results. • For benign ends such as simplifying page content, optimizing load times, etc. • For malicious purposes such as manipulating page-ranking algorithms.

  4. Cloaking • Conceals the true nature of a Web site • Keyword stuffing – associating benign content with keywords • Attracting traffic to scam pages • Protecting the Web servers from being exposed • Not scamming those who arrive at the site via different keywords.

  5. Types of Cloaking • Repeat Cloaking • User Agent Cloaking • Referrer Cloaking (sometimes also called “Click-through Cloaking”) • IP Cloaking

  6. DAGGER • Dagger encompasses five different functions: • Collection of search terms • Querying the search results generated by search engines • Crawling search results • Detecting cloaking • Repeating the above four processes to study variance in measurements.

  7. Collection of Search Terms • Two different kinds of cloaked search terms are targeted: • TYPE 1: Search terms which contain popular words • Aimed at gathering high volumes of undifferentiated traffic. • TYPE 2: Search terms which reflect highly targeted traffic • Here the cloaked content matches the search terms.

  8. TYPE 1: Use popular trending search terms • Google Hot Searches – sheds light on search-engine-based data collection methods • Alexa – client-based data collection methods • Twitter terms – clue us in on social-networking trends • The cloaked page is entirely unrelated to the trending search terms • TYPE 2: A set of terms catering to a specific domain • The content of the cloaked pages actually matches the search terms.
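
To make the term-collection step concrete, here is a minimal Python sketch that merges TYPE 1 trending feeds with a TYPE 2 domain-specific list and tags each term with its type. The feed file names (hot_searches.txt, alexa.txt, twitter.txt, pharma.txt) are illustrative placeholders, not part of Dagger.

```python
# Minimal sketch: merge term feeds into one labelled term list.
from pathlib import Path

def load_terms(path):
    """Read one term per line, lower-cased and stripped of whitespace."""
    return {line.strip().lower()
            for line in Path(path).read_text().splitlines()
            if line.strip()}

def collect_terms():
    type1 = set()
    for feed in ("hot_searches.txt", "alexa.txt", "twitter.txt"):
        type1 |= load_terms(feed)            # trending terms: undifferentiated traffic
    type2 = load_terms("pharma.txt")         # targeted, domain-specific terms
    # Tag each term with its type so later passes can analyse them separately.
    return [("TYPE1", t) for t in sorted(type1)] + \
           [("TYPE2", t) for t in sorted(type2)]
```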

  9. Querying Search Results • The terms collected in the previous step are fed to the search engines • Study the prevalence of cloaking across engines • Examine their response to cloaking • The top 100 search results and accompanying metadata are compiled into a list • “Known good” domain entries are eliminated in order to avoid false positives during data processing • Similar entries are grouped together with an appropriate ‘count’.
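
A rough sketch of the result-compilation step described above, assuming raw_results is the list of top URLs already returned by a search engine for one term; the whitelist entries are examples of “known good” domains, not the study’s actual list.

```python
# Sketch: keep the top-N results, drop known-good domains, group duplicates.
from collections import Counter
from urllib.parse import urlsplit

KNOWN_GOOD = {"wikipedia.org", "amazon.com", "youtube.com"}   # example whitelist entries

def compile_results(raw_results, limit=100):
    kept = []
    for url in raw_results[:limit]:                           # top-N results only
        domain = urlsplit(url).netloc.lower()
        domain = domain[4:] if domain.startswith("www.") else domain
        if any(domain == d or domain.endswith("." + d) for d in KNOWN_GOOD):
            continue                                          # skip known-good domains (fewer false positives)
        kept.append(url)
    return Counter(kept)                                      # duplicate entries grouped with a count
```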

  10. Crawling Search Results • Crawl the URLs • Process the fetched pages • Detect cloaking in parallel • Helps minimize any possible time-of-day effects • Multiple crawls are performed.
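
The parallel crawl might be sketched as below using only the Python standard library; the two User-Agent strings stand in for a browser and a search crawler, and the real system may have used different headers or a full browser.

```python
# Sketch: fetch every URL as both a "user" and a "crawler", in parallel.
import concurrent.futures
import urllib.request

BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
CRAWLER_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"

def fetch(url, user_agent, referrer=None, timeout=15):
    headers = {"User-Agent": user_agent}
    if referrer:
        headers["Referer"] = referrer
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()

def crawl(urls):
    views = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
        futures = {pool.submit(fetch, u, ua): (u, name)
                   for u in urls
                   for name, ua in (("user", BROWSER_UA), ("crawler", CRAWLER_UA))}
        for fut in concurrent.futures.as_completed(futures):
            url, name = futures[fut]
            try:
                views.setdefault(url, {})[name] = fut.result()
            except OSError:
                pass        # unreachable pages are simply skipped in this sketch
    return views
```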

  11. Views fetched for each URL: • Normal search user • Googlebot Web crawler • A user who does not click through the search result – detects pure user-agent cloaking without any checks on the referrer • 35% of cloaked search results for a single measurement perform pure user-agent cloaking • Pages that employ both user-agent and referrer cloaking are nearly always malicious • IP cloaking – half of current cloaked search results do in fact employ IP cloaking via reverse DNS lookups.
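
The three views can be compared to separate pure user-agent cloaking from referrer (click-through) cloaking. The classification below is an illustrative simplification: fetch and differs stand for any fetch helper and page-comparison function (for example, the crawl sketch above and the scoring sketch after the next slide), and the referrer value is an assumed example.

```python
SEARCH_REFERRER = "https://www.google.com/search?q=example"   # assumed example referrer

def classify_cloaking(url, fetch, browser_ua, crawler_ua, differs):
    """fetch(url, user_agent, referrer=None) and differs(a, b) -> bool
    are supplied by the caller."""
    crawler_view = fetch(url, crawler_ua)                            # search-engine crawler view
    clickthrough = fetch(url, browser_ua, referrer=SEARCH_REFERRER)  # user arriving via a search result
    direct_visit = fetch(url, browser_ua)                            # user who did not click through

    ua_cloaking       = differs(crawler_view, direct_visit)   # content differs on user agent alone
    referrer_cloaking = differs(clickthrough, direct_visit)   # content differs when a search referrer is added
    if ua_cloaking and referrer_cloaking:
        return "user-agent + referrer cloaking"
    if ua_cloaking:
        return "pure user-agent cloaking"
    if referrer_cloaking:
        return "referrer (click-through) cloaking"
    return "no cloaking detected"
```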

  12. Detecting Cloaking • Process the crawled data using multiple iterative passes • Various transformations and analyses are applied • This helps compile the information needed to detect cloaking • Each pass uses a comparison-based approach: • Apply the same transformations to the views of the same URL, as seen by the user and by the crawler • Directly compare the results of the transformation using a scoring function • Thresholding – detect pages that are actively cloaking and annotate them for later analysis.
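
One comparison pass could look like the following sketch: the same transformation (tag stripping and tokenisation) is applied to both views, a Jaccard similarity score is computed, and a threshold flags the page as cloaking. The transformation and the 0.3 threshold are illustrative choices, not the paper’s exact method or values.

```python
# Sketch: transform both views identically, score them, then threshold.
import re

TAG_RE = re.compile(rb"<[^>]+>")

def word_set(html_bytes):
    text = TAG_RE.sub(b" ", html_bytes).decode("utf-8", errors="ignore").lower()
    return set(re.findall(r"[a-z0-9]+", text))

def similarity(view_a, view_b):
    a, b = word_set(view_a), word_set(view_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)          # Jaccard similarity of the two word sets

def is_cloaked(user_view, crawler_view, threshold=0.3):
    """Flag the URL when the two views share too little content."""
    return similarity(user_view, crawler_view) < threshold
```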

  13. Temporal Re-measurement • To study the lifetime of cloaked pages, Dagger includes a temporal component • Fetch search results from the search engines • Crawl and process the URLs again at later instants in time • Measure the rate at which search engines respond to cloaking • Measure the duration for which pages remain cloaked.
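
The temporal component can be sketched as a simple re-measurement loop that re-crawls the same result URLs at fixed intervals and re-runs the cloaking check, yielding per-URL cloaking histories from which lifetimes can be estimated. The four-hour interval and seven-day horizon (42 rounds) are illustrative, not the study’s actual schedule.

```python
# Sketch: repeat crawl + cloaking check over time and record the history.
import time

def remeasure(urls, crawl, is_cloaked_pair, interval_s=4 * 3600, rounds=42):
    """crawl(urls) -> {url: {"user": bytes, "crawler": bytes}} and
    is_cloaked_pair(user_view, crawler_view) -> bool are supplied by the caller."""
    history = {u: [] for u in urls}
    for rnd in range(rounds):
        views = crawl(urls)                                   # re-fetch both views of every URL
        for u in urls:
            pair = views.get(u, {})
            cloaked = ("user" in pair and "crawler" in pair
                       and is_cloaked_pair(pair["user"], pair["crawler"]))
            history[u].append((rnd, cloaked))                 # cloaking status for this round
        time.sleep(interval_s)                                # wait for the next measurement point
    return history
```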

  14. Cloaking Over Time • In trending searches the terms constantly change • Cloakers target many more search terms and a broad demographic of potential victims • Pharmaceutical search terms are static • They represent product searches in a very specific domain • Cloakers have much more time to perform SEO to raise the rank of their cloaked pages • This results in more cloaked pages in the top results.

  15. Sources of Search Terms • Blackhat SEO – artificially boosts the rankings of cloaked pages • Search engines detect cloaking either directly (analyzing pages) or indirectly (updating the ranking algorithm) • Augmenting popular search terms with suggestions enables targeting the same semantic topic as the popular terms • Cloaking in search results is highly influenced by the search terms.

  16. Search Engine Response • Search engines try to identify and thwart cloaking • Cloaked pages do regularly appear in search results • Many are removed or suppressed by the search engines within hours to a day • Cloaked search results rapidly begin to fall out of the top 100 within the first day, with a more gradual drop thereafter.

  17. Cloaking Duration • Cloakers manage their pages similarly, independent of the search engine • Pages are cloaked for long durations: over 80% remain cloaked past seven days • Cloakers want to maximize the time during which they benefit from cloaking, attracting customers to scam sites or victims to malware sites • It is difficult to recycle a cloaked page for reuse at a later time.

  18. Cloaked Content • Users are redirected through a chain of advertising networks • About half of the time a cloaked search result leads to some form of abuse • Long-term SEO campaigns constantly change the search terms they are targeting and the hosts they are using.

  19. Domain Infrastructure • Key resources needed to effectively deploy cloaking in a scam: • Access to Web sites • Access to domains • For TYPE 1 terms, the majority of cloaked search results are in .com • For TYPE 2 terms, cloakers use the “reputation” of pages to boost their ranking in search results.

  20. Search Engine Optimization • Since a major motivation for cloaking is to attract user traffic, we can extrapolate SEO performance from the search-result positions the cloaked pages occupy • Cloaking TYPE 1 terms targets popular terms that are very dynamic, with limited time and heavy competition for performing SEO on those search terms • Cloaking TYPE 2 terms is a highly focused task on a static set of terms, providing much longer time frames for performing SEO on cloaked pages for those terms.

  21. Conclusion • Cloaking has become a standard tool in the scammer’s toolbox • Cloaking adds significant complexity to differentiating legitimate Web content from fraudulent pages • The majority of cloaked search results remain high in the rankings for 12 hours • The pages themselves can persist far longer • Search engine providers will need to further reduce the lifetime of cloaked results to demonetize the underlying scam activity.
