
Detecting Phishing in Emails



Presentation Transcript


  1. Detecting Phishing in Emails. Srikanth Palla and Ram Dantu, University of North Texas, Denton

  2. What is Phishing? • Phishing is a form of online identity theft • Employs both social engineering and technical subterfuge • Targets consumers' personal identity data and financial account credentials such as credit card numbers, account usernames, passwords and social security numbers • Social-engineering schemes use 'spoofed' e-mails to lead consumers to counterfeit websites. Source: Anti-Phishing Working Group (APWG)

  3. Phishing Tactics • Hijacking reputable brand names • Creating a plausible premise • Redirecting URLs • Collecting confidential information through emails

  4. Do we need to restrict Phishing attacks?

  5. The Statistics… Sources: Anti Phishing Working Group

  6. Problems with Current Spam Filtering Techniques • Current spam filters focus on analyzing the message content • Most phishers obfuscate their email content to bypass these filters • Filters label an email as BULK and expect the recipient to decide whether the email source is authentic • Current spam filters have a high rate of false positives

  7. Methodology Our method examines: • The header of the email (not the content) • The social network of the recipient • The credibility of the source. It classifies phishers as: • Prospective phishers • Recent phishers • Suspects • Serial phishers

  8. Traffic Profile The figure shows the incoming email traffic profiles, based on the number of recipients and how often they receive the message.

  9. Email Corpus Traffic Profile • Our analysis requires the recipient's sent email folder • The emails provided in the TREC evaluation toolkit are spam and non-spam emails • We require a mix of legitimate and phishing emails to evaluate our filter • We analyzed a live corpus of 13,843 emails, collected over 2.5 years. This corpus has a mix of legitimate, spam and phishing emails. The different categories of emails are shown in the figure

  10. Experimental Setup • We deployed our classifier on a recipient's local machine running an IMAP proxy and Thunderbird (MUA). • All of the recipient's emails were fed directly into our classifier by the proxy. • Our classifier periodically scans the user's mailbox files for new incoming emails. • DNS-based header analysis, social network analysis and wantedness analysis were performed on each email. • The end result is that each email is tagged as one of: Phishing, Opt-out, Socially distinct or Socially close.
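
The periodic mailbox scan described in the setup can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes the mailbox is stored in mbox format (as Thunderbird does) and that a message counts as new when its Message-ID has not been seen before. The function name `new_messages` and the `seen_ids` set are hypothetical.

```python
import mailbox


def new_messages(mbox_path, seen_ids):
    """Return messages whose Message-ID is not yet in seen_ids.

    seen_ids is updated in place, so calling this function again on the
    same mailbox only returns mail that arrived since the last scan.
    Assumption: messages without a Message-ID header are skipped.
    """
    box = mailbox.mbox(mbox_path)
    fresh = []
    try:
        for msg in box:
            msg_id = msg.get("Message-ID")
            if msg_id and msg_id not in seen_ids:
                seen_ids.add(msg_id)
                fresh.append(msg)
    finally:
        box.close()
    return fresh
```

A scheduler (for example a simple loop with a sleep) would call `new_messages` at a fixed interval and hand each returned message to the analysis steps.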

  11. Architecture The architecture model of our classifier consists of three analyses followed by a classification step • Step 1: DNS-based header analysis • Step 2: Social network analysis • Step 3: Wantedness analysis • Step 4: Classification

  12. Step 1: DNS-based Header Analysis Stage 1: In this stage, we validate the information provided in the email header: the hostname position of the sender, the mail server and the relays in the rest of the path. We divide the entire corpus into two buckets: • The emails which are valid for DNS lookups (Bucket 1). • The emails which are not valid for DNS lookups (Bucket 2). Stage 2: This stage performs a DNS lookup on the hostname provided in each Received: line of the header and matches the IP address returned against the IP address stored next to the hostname by the relays during the SMTP authorization process. Bucket 1 is further divided into: • Trusted bucket. • Untrusted bucket. We pass Bucket 2 and both the trusted and untrusted buckets to the Social Network Analysis phase for further analysis.
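
The two stages above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: it assumes Received: lines of the common form `from host (... [ip])`, treats an email with no parsable hostname/IP pair as Bucket 2, and treats any hop whose recorded IP is not among the resolved addresses (including hostnames that fail to resolve) as untrusted. The names `extract_hops`, `dns_bucket` and `resolve_ipv4` are hypothetical.

```python
import re
import socket
from email import message_from_string

# Assumed layout: "from <hostname> (... [<ipv4>]) by ...". Real headers vary.
RECEIVED_RE = re.compile(
    r"from\s+(?P<host>[\w.-]+)\s.*?\[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\]",
    re.S,
)


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses for hostname (empty on failure)."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def extract_hops(raw_email):
    """Return (hostname, recorded_ip) pairs from the Received: lines."""
    msg = message_from_string(raw_email)
    hops = []
    for line in msg.get_all("Received", []):
        match = RECEIVED_RE.search(line)
        if match:
            hops.append((match.group("host"), match.group("ip")))
    return hops


def dns_bucket(raw_email, resolver=resolve_ipv4):
    """Classify an email as 'bucket2', 'trusted' or 'untrusted'."""
    hops = extract_hops(raw_email)
    if not hops:
        return "bucket2"  # Stage 1: nothing to look up
    for host, recorded_ip in hops:
        if recorded_ip not in resolver(host):
            return "untrusted"  # Stage 2: DNS answer disagrees with header
    return "trusted"
```

The `resolver` parameter is injectable so that the logic can be exercised offline with a stub instead of live DNS queries.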

  13. Step 2: Social Network Analysis Each of the three buckets received from the DNS-based header analysis (Bucket 2, the untrusted bucket and the trusted bucket) is treated with rules formulated by analyzing the emails in the recipient's "sent" folder. For instance: • All emails from trusted domains are removed • Familiarity with the sender's community • Familiarity with the path traversed. The rules can be built according to the recipient's email filtering preferences.
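
One of the rules above, removing emails from trusted domains, can be sketched as follows. The sketch assumes that a domain is "trusted" when the recipient has previously written to an address in that domain; the slides do not spell out the exact criterion, so this is an illustrative reading. The function names and the dictionary layout of `sent_headers` are hypothetical.

```python
from email.utils import getaddresses, parseaddr


def domain_of(address):
    """Return the lower-cased domain part of an address, or '' if none."""
    return address.rsplit("@", 1)[-1].lower() if "@" in address else ""


def trusted_domains(sent_headers):
    """Collect the domains the recipient has written to.

    sent_headers: iterable of dicts holding the 'To' and optional 'Cc'
    header strings of each email in the "sent" folder.
    """
    domains = set()
    for headers in sent_headers:
        fields = [v for v in (headers.get("To"), headers.get("Cc")) if v]
        for _name, addr in getaddresses(fields):
            domain = domain_of(addr)
            if domain:
                domains.add(domain)
    return domains


def is_familiar(from_header, trusted):
    """True if the sender's domain appears in the trusted set."""
    return domain_of(parseaddr(from_header)[1]) in trusted
```

Emails for which `is_familiar` returns true would be dropped from further analysis; the remaining rules (community and path familiarity) would need additional data that the slides do not describe in detail.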

  14. Classification of Trusted and Untrusted Senders

  15. Step 3: Wantedness Analysis Measuring the sender's credibility (ρ): • We believe the credibility of a sender depends on the nature of his recent emails • If the recent emails sent by the sender are legitimate, his credibility increases • If the recent emails from the sender are fraudulent, his fraudulency increases

  16. Credibility Drops As Time Progresses for Untrusted Senders

  17. Computing Credibility • (ΔT legitimate emails) is the average time period of all legitimate emails w.r.t. the most recent email • (ΔT fraudulent emails) is the average time period of all fraudulent emails w.r.t. the most recent email
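
The two ΔT quantities can be computed as in the sketch below. It assumes that "average time period w.r.t. the most recent email" means the mean elapsed time between each email of a class and the sender's most recent email, measured here in days. The formula that combines these values into the credibility ρ is not present in this transcript, so the sketch stops at the two averages. The function names are hypothetical.

```python
from datetime import datetime


def mean_age_days(timestamps, most_recent):
    """Mean gap in days between each timestamp and most_recent."""
    if not timestamps:
        return None  # no emails of this class from the sender
    total_seconds = sum((most_recent - t).total_seconds() for t in timestamps)
    return total_seconds / len(timestamps) / 86400.0


def delta_t(emails):
    """Return (ΔT legitimate, ΔT fraudulent) for one sender.

    emails: non-empty list of (sent_time, is_legitimate) pairs.
    """
    most_recent = max(sent_time for sent_time, _ in emails)
    legitimate = [t for t, ok in emails if ok]
    fraudulent = [t for t, ok in emails if not ok]
    return (
        mean_age_days(legitimate, most_recent),
        mean_age_days(fraudulent, most_recent),
    )
```

Under this reading, a small ΔT for fraudulent emails means the sender's fraudulent messages are recent, which matches the earlier statement that recent behavior drives credibility.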

  18. Credibility of Untrusted Senders

  19. Measuring Recipient's Wantedness • Tolerance (α+) for a sender is higher if the recipient reads his emails and stores them for a longer period • Intolerance (β-) for a sender is higher if the recipient deletes his emails without reading them

  20. Measuring Wantedness • (ΔT legitimate emails) is the average time period of all legitimate emails w.r.t. the most recent email • (ΔT fraudulent emails) is the average time period of all fraudulent emails w.r.t. the most recent email • T_rd is the average storage time period of all the read emails • T_urd is the average storage time period of all unread emails
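
The two storage-time averages (written here as T_rd for read emails and T_urd for unread emails) can be computed as in this short sketch. It assumes the storage time of each email is already known in days, for example from arrival and deletion timestamps. How these averages feed into tolerance and intolerance is not given in this transcript and is not reproduced here. The function name and record layout are hypothetical.

```python
def storage_averages(records):
    """Return (T_rd, T_urd) for one sender.

    records: list of (was_read, stored_days) pairs, one per email.
    A value of None means the sender has no emails of that kind.
    """
    read = [days for was_read, days in records if was_read]
    unread = [days for was_read, days in records if not was_read]
    t_rd = sum(read) / len(read) if read else None
    t_urd = sum(unread) / len(unread) if unread else None
    return t_rd, t_urd
```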

  21. Wantedness of Trusted Senders

  22. Classification • Classification of Phishers: • Credibility Vs Phishing Frequency • Classification of Trusted Senders: • Credibility Vs Wantedness

  23. Classification of Phishers

  24. Classification of Trusted Senders

  25. Summary of Results Precision is the percentage of messages that were classified as phishing that actually are phishing
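
As a small worked illustration of this definition, precision can be computed from per-message predictions and ground-truth labels as below. The function name and the boolean-list representation are assumptions made for the example.

```python
def precision(predicted_phishing, actually_phishing):
    """Percentage of messages flagged as phishing that really are phishing.

    Both arguments are equal-length lists of booleans, one per message.
    Returns None when no message was flagged.
    """
    flagged = [
        truth
        for flag, truth in zip(predicted_phishing, actually_phishing)
        if flag
    ]
    if not flagged:
        return None
    return 100.0 * sum(flagged) / len(flagged)
```

For example, if four messages are flagged and three of them are truly phishing, the precision is 75%.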

  26. Conclusions • Phishers use special software to conceal the path taken by their emails to reach the recipient. Most of the time the path length is a single hop. • Our classifier can be used in conjunction with any existing spam filtering technique to restrict spam and phishing emails • Rather than labeling an email as BULK, we further classify senders, based on their credibility and wantedness, as: • Prospective phishers • Suspects • Recent phishers • Serial phishers • We classified two different email corpora with a precision of 98.4% and 99.2% respectively
