SP.a.M/\TØ Spamato: An Extendable Spam Filter System
by Keno Albrecht, Nicolas Burri, and Roger Wattenhofer
Motivation
• Countless number of different spam filters
  • Google: 1,740,000 hits (not spam filters)
  • Freshmeat/Sourceforge: 404/420 projects
  • Several "once-only" research projects
• Client-side filtering (vs. server-side)
  • Email Client Add-On: Outlook (Express), …
  • Proxy: Mediator between Client and Server
  • Stand-alone: Proprietary "email clients"
Spamato - Keno Albrecht - Second Conference on Email and Anti-Spam - July 21 & 22, 2005
Project Goal
• Build an extendable spam filter system to…
  • ease the development of filters; provide a filter container
  • help implement tools for common tasks
  • support as many email clients as possible
• Encourage filter developers to use our framework
Subject: Free Spam Filter System
To: developer@spamfilter.net
From: keno@spamato.net

Dear Spam Filter Developer,

This is your once-in-a-lifetime opportunity to use the free spam filter system Spamato. Spamato aims to bring a practical, easy-to-use, and effective spam filter technology to the user's desktop. It has been designed to be used primarily as an add-on for several email clients. The combination of multiple filtering techniques leads to a high spam detection rate and a low false-positive rate. It offers a variety of features that simplify your life as a spam filter developer.

Do not reinvent the wheel! Write your filter in an instant! Use Spamato!

Visit our homepage at http://www.spamato.net. To unsubscribe click here.

The Spamato-Team
Architecture
• Depending on Add-on:
  • Visual Basic
  • JavaScript
  • …
• Java
  • platform independent
Filtering Process
Emails are processed in five phases:
(1) Initialization
(2) Pre-Check
(3) Check
(4) Decision
(5) Post-Check
Filtering Process (1) Initialization
• Email client receives email, forwards it to Spamato, and waits for check result.
Filtering Process (2) Pre-Check
• Veto against further processing (Configuration, Sender-whitelist)
• Gain information for other plugins (URL extractor)
Filtering Process (3) Check
• Each filter calculates the spam probability
Filtering Process (4) Decision
• The overall spam probability is calculated and returned to the email client
Filtering Process (5) Post-Check
• Learn from global decision
• Collect statistics
• Play sound
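The five phases above can be sketched as a small pipeline. This is an illustrative Python sketch only, not Spamato's actual (Java) plugin API: the names `Email`, `whitelist_veto`, `extract_urls`, and `process` are hypothetical, the overall probability is assumed to be a plain average of the filter results, and a veto is assumed to mean "treat as ham and skip the remaining phases".

```python
import re


class Email:
    """Hypothetical email container; 'notes' carries data between plugins."""

    def __init__(self, sender, body):
        self.sender = sender
        self.body = body
        self.notes = {}


def whitelist_veto(whitelist):
    """Pre-checker factory: veto further processing for whitelisted senders."""
    def pre_check(email):
        return email.sender in whitelist  # True means "veto"
    return pre_check


def extract_urls(email):
    """Pre-checker that gathers information for later plugins, never vetoes."""
    email.notes["urls"] = re.findall(r"https?://[^\s\"'<>]+", email.body)
    return False


def process(email, pre_checkers, filters, post_processors, threshold=0.5):
    # (1) Initialization: the email client has handed over 'email' and waits.
    # (2) Pre-Check: any plugin may veto (assumed here to mean "not spam").
    for pre_check in pre_checkers:
        if pre_check(email):
            return False
    # (3) Check: each filter returns a spam probability in [0, 1].
    scores = [spam_filter(email) for spam_filter in filters]
    # (4) Decision: an unweighted average is an assumption of this sketch.
    overall = sum(scores) / len(scores) if scores else 0.0
    is_spam = overall >= threshold
    # (5) Post-Check: learners, statistics, notifications see the result.
    for post in post_processors:
        post(email, is_spam)
    return is_spam
```

A caller would pass lists of plugins, for example `process(email, [whitelist_veto({"friend@example.org"}), extract_urls], my_filters, my_post_processors)`, where the filters can read `email.notes["urls"]` filled in during the pre-check.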
Filters
• Bayesianato: Naïve Bayesian-based filter
• Ruleminator: Rule-based filter
• Razor (Ephemeral): Hash-based filter
  • Vipul's Razor: http://razor.sourceforge.net
• URL-based filters:
  • Domainator: Search engine ("Google") filter
  • Earlgrey: Our collaborative multi-domain filter
  • Razor (Whiplash): Collaborative single-domain filter
URL/URI/Domain Filtering
• About 70,000 spam emails investigated
• ~76% with at least one domain, thereof…
  • ~20% with more than one distinct domain
  • ~2% with ten or more distinct domains
• Spammers obfuscate their messages for the (sole) purpose of misleading URL filters!
• How to handle "fake" (including ham) domains? How to find "spam" domains?
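Counts like those above require extracting the distinct domains from each message. The following is a minimal sketch of one way to do that, not the extraction method used in the study: the function names are hypothetical, only `http`/`https` URLs are recognized, and full hostnames are used as "domains" (reducing them to registered domains would need a public suffix list, which is omitted here).

```python
import re
from urllib.parse import urlsplit

# Simplified URL pattern (an assumption of this sketch): http(s) links only.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def distinct_domains(body):
    """Return the set of distinct hostnames linked from an email body."""
    domains = set()
    for url in URL_RE.findall(body):
        url = url.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        try:
            host = urlsplit(url).hostname  # already lowercased by urlsplit
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 literal
        if host:
            domains.add(host)
    return domains


def domain_buckets(bodies):
    """Count messages per bucket, mirroring the categories on the slide."""
    counts = [len(distinct_domains(body)) for body in bodies]
    return {
        "at_least_one": sum(1 for c in counts if c >= 1),
        "more_than_one": sum(1 for c in counts if c > 1),
        "ten_or_more": sum(1 for c in counts if c >= 10),
    }
```

Obfuscated links (for example, hosts hidden behind redirects or encoded characters) would slip past a simple pattern like this, which is exactly the weakness the slide attributes to URL filters.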
URL-Filters in Comparison
• 26.5% of all spam messages were identified by the Domainator but not by the Earlgrey filter; 1.1% were identified by the Domainator but not by the Razor (Whiplash) filter.
• 27.3% of all messages were not identified by the Domainator, and 0.6% of all spam messages were solely identified by it.
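Percentages of this kind follow from set operations on the messages each filter flags. The sketch below shows the arithmetic under stated assumptions: `detected` is a hypothetical mapping from filter name to the set of message IDs it identified as spam, and `total` is the number of spam messages examined. It does not reproduce the data behind the figures above.

```python
def compare(detected, total, name):
    """Compare one filter's detections against all other filters.

    detected: dict mapping filter name -> set of flagged message IDs
    total:    number of spam messages examined
    name:     the filter to compare against the others
    """
    mine = detected[name]
    others = {n: ids for n, ids in detected.items() if n != name}
    flagged_by_others = set().union(*others.values()) if others else set()

    def pct(k):
        return 100.0 * k / total

    return {
        # messages this filter did not identify at all
        "missed": pct(total - len(mine)),
        # messages identified by this filter and by no other filter
        "solely": pct(len(mine - flagged_by_others)),
        # per other filter: identified by this filter but not by that one
        "not_by": {n: pct(len(mine - ids)) for n, ids in others.items()},
    }
```

With real data, `compare(detected, total, "Domainator")` would yield the "identified by the Domainator but not by …", "not identified", and "solely identified" percentages in one pass.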
Conclusion & Future Work
• Spamato eases the implementation and deployment of spam filters and tools. It can be used with all email clients. It is open source.
• A multi-faceted (URL-) filtering approach is reasonable.
• TODO:
  • Integration of more filters and improved analysis tools
  • Decision module (dynamic weighting of filter results)
  • Trust system for collaborative filters
Thank you! Questions? Comments?
• (Un)subscribe?
  • kenoa@tik.ee.ethz.ch
  • keno@spamato.net
  • http://www.spamato.net
  • http://sf.net/projects/spamato