This paper presents a novel approach for detecting port scans through Sequential Hypothesis Testing. Port scanning is a common reconnaissance technique used by attackers to identify vulnerabilities in target networks, making detection crucial. Traditional methods often generate high false positive rates and struggle with stealthy scans. This research introduces an efficient algorithm that leverages statistical foundations to detect port scan traffic while maintaining low false positive rates. The methodology is validated against established systems, demonstrating significant improvements in accuracy and efficiency.
Fast Portscan Detection Using Sequential Hypothesis Testing Authors: Jaeyeon Jung, Vern Paxson, Arthur W. Berger, and Hari Balakrishnan Publication: IEEE Symposium on Security and Privacy 2004 Presenter: Ryan Cunningham
A quick note • All images and equations taken directly from the publication
Port scanning • Network reconnaissance technique • Usually a prelude to an attack • Difficult to detect • Traffic difficult to distinguish from regular traffic • Stealth scans can occur very slowly • Some scans are legitimate • Search engine spiders • SSH, peer-to-peer applications, etc.
Previous detection techniques • Limit distinct connection attempts from one IP • Network Security Monitor • Snort • Also detects malformed packets • Limit failed connection attempts from one IP • Bro • Sensitive to service on specific port • Robertson et al. showed threshold very important
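A minimal sketch of the threshold idea on this slide, flagging a source once it has failed to connect to too many distinct destinations. The class name and the threshold value are hypothetical; this is not how Snort or Bro implement it, nor their default settings.

```python
from collections import defaultdict


class FailedAttemptCounter:
    """Flag a source IP after failed attempts to too many distinct destinations."""

    def __init__(self, threshold=20):
        # threshold is an illustrative assumption, not a value from any real tool
        self.threshold = threshold
        self.failed_targets = defaultdict(set)

    def observe(self, source_ip, dest_ip, success):
        """Record one connection attempt; return True if the source is flagged."""
        if not success:
            self.failed_targets[source_ip].add(dest_ip)
        # .get avoids creating empty entries for sources that never fail
        return len(self.failed_targets.get(source_ip, ())) >= self.threshold
```

The sketch makes the sensitivity to the threshold visible: a low value flags benign hosts that hit a few dead addresses, while a high value lets a scanner probe many hosts before it is noticed.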
Previous detection techniques • Probabilistic model • Developed by Leckie et al. • Assesses typical traffic a machine receives • Also assesses the traffic a remote machine is likely to send • Combines these probabilities • If the result exceeds a threshold, an alert is raised • Generates too many false positives
Previous detection techniques • SPICE • Similar to probabilistic model • Used to detect low traffic “stealth” scans • Too computationally intensive for real world
Data set • Traffic from two sites • LBL • 6,000 hosts • Sparse address space 4.4% • ICSI • 200 hosts • Dense address space 42%
Data set • Anonymized TCP logs from Bro • Recorded for one 24 hour period • Bro NIDS flags for comparison and validation
Data set • Unsuccessful Login attempt analysis
Data set • Analysis of the ratio of successful to unsuccessful login attempts
Observations • Scans usually come from one host • Scans make lots of failed connection attempts and few successful connection attempts • Scans should ideally be detected quickly • False positive rate should be configurable
Sequential Hypothesis Testing • Proposed by Wald in the 1940s • Method of doing repeated hypothesis testing as sequential data is gathered • Deciding between two hypotheses • Each time a data point arrives, decide • Accept H0 (in our case, benign traffic) • Accept H1 (in our case, port scan traffic) • Wait for more data (next connection attempt)
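Sequential Hypothesis Testing • We specify parameters α and β • α ≥ false positive probability (upper bound) • β ≤ detection probability (lower bound) • We must estimate parameters θ0 and θ1 • θ0: probability a benign connection attempt is successful • θ1: probability a scanner connection attempt is successful
Sequential Hypothesis Testing • For each test, we compute the likelihood ratio: $\Lambda(Y) = \prod_{i=1}^{n} \frac{\Pr[Y_i \mid H_1]}{\Pr[Y_i \mid H_0]}$ • Where $Y_i = 0$ if the i-th connection attempt succeeds and $Y_i = 1$ if it fails, so $\Pr[Y_i = 0 \mid H_j] = \theta_j$ and $\Pr[Y_i = 1 \mid H_j] = 1 - \theta_j$
Sequential Hypothesis Testing • Compare likelihood ratio to: $\eta_1 = \beta / \alpha$ and $\eta_0 = (1 - \beta) / (1 - \alpha)$ • If • Λ < η0 then this is benign traffic • Λ > η1 then this is scan traffic • Otherwise, wait for another connection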
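The decision rule on this slide can be sketched as a small per-source state machine. This is a minimal illustration, not the authors' implementation: the class name, field names, and default parameter values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SequentialTest:
    """Per-source sequential hypothesis test over connection outcomes."""

    # All four defaults are illustrative assumptions.
    p_success_benign: float = 0.8     # chance a benign attempt succeeds
    p_success_scanner: float = 0.2    # chance a scanner attempt succeeds
    max_false_positive: float = 0.01  # upper bound on false positive probability
    min_detection: float = 0.99       # lower bound on detection probability
    ratio: float = 1.0                # running likelihood ratio

    def __post_init__(self):
        # Upper and lower decision thresholds derived from the two bounds.
        self.upper = self.min_detection / self.max_false_positive
        self.lower = (1 - self.min_detection) / (1 - self.max_false_positive)

    def observe(self, success: bool) -> str:
        """Fold one connection outcome into the ratio and report the decision."""
        if success:
            self.ratio *= self.p_success_scanner / self.p_success_benign
        else:
            self.ratio *= (1 - self.p_success_scanner) / (1 - self.p_success_benign)
        if self.ratio > self.upper:
            return "scanner"
        if self.ratio < self.lower:
            return "benign"
        return "pending"
```

With these illustrative values each failed attempt multiplies the ratio by about 4, so four consecutive failures reach roughly 256, which is above the upper threshold of 99, and the source is declared a scanner on the fourth attempt.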
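Sequential Hypothesis Testing • We can estimate the expected number of connections required to decide with: $E[N \mid H_0] = \frac{\alpha \ln\frac{\beta}{\alpha} + (1-\alpha)\ln\frac{1-\beta}{1-\alpha}}{\theta_0 \ln\frac{\theta_1}{\theta_0} + (1-\theta_0)\ln\frac{1-\theta_1}{1-\theta_0}}$ and $E[N \mid H_1] = \frac{\beta \ln\frac{\beta}{\alpha} + (1-\beta)\ln\frac{1-\beta}{1-\alpha}}{\theta_1 \ln\frac{\theta_1}{\theta_0} + (1-\theta_1)\ln\frac{1-\theta_1}{1-\theta_0}}$ • Here α and β bound the false positive and detection probabilities, and θ0, θ1 are the success probabilities of a benign and a scanner connection attempt • Derivation is long and messy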
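To get a feel for how quickly the test decides, the expected number of connections can be evaluated numerically using Wald's standard approximation (expected log-ratio at exit divided by expected log-ratio step). The function name and the sample parameter values in the usage note are assumptions for illustration only.

```python
import math


def expected_connections(p_success_benign, p_success_scanner,
                         max_false_positive, min_detection):
    """Return (expected attempts for a benign source, for a scanner source)."""
    log_upper = math.log(min_detection / max_false_positive)
    log_lower = math.log((1 - min_detection) / (1 - max_false_positive))
    # Change in the log likelihood ratio for one successful / failed attempt.
    step_success = math.log(p_success_scanner / p_success_benign)
    step_failure = math.log((1 - p_success_scanner) / (1 - p_success_benign))

    def mean_steps(p_success, p_decide_scanner):
        # Expected log-ratio at the moment of decision ...
        expected_exit = (p_decide_scanner * log_upper
                         + (1 - p_decide_scanner) * log_lower)
        # ... divided by the expected log-ratio change per attempt.
        expected_step = p_success * step_success + (1 - p_success) * step_failure
        return expected_exit / expected_step

    benign = mean_steps(p_success_benign, max_false_positive)
    scanner = mean_steps(p_success_scanner, min_detection)
    return benign, scanner
```

Calling `expected_connections(0.8, 0.2, 0.01, 0.99)` with these hypothetical inputs gives between five and six attempts for both kinds of source, since the chosen values are symmetric.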
Results • Efficiency = true positive / total reported positive • Effectiveness = true positive / total actually positive
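The two ratios on this slide can be computed from a set of flagged hosts and a set of known scanners. A short sketch follows; the function name and the example addresses in the usage note are hypothetical.

```python
def efficiency_and_effectiveness(flagged, actual_scanners):
    """Return (efficiency, effectiveness) for a set of flagged source hosts."""
    flagged = set(flagged)
    actual_scanners = set(actual_scanners)
    true_positives = flagged & actual_scanners
    # Efficiency: share of reported hosts that really are scanners.
    efficiency = len(true_positives) / len(flagged) if flagged else 0.0
    # Effectiveness: share of real scanners that were reported.
    effectiveness = (len(true_positives) / len(actual_scanners)
                     if actual_scanners else 0.0)
    return efficiency, effectiveness
```

For example, flagging four hosts of which two are among three real scanners gives an efficiency of 0.5 and an effectiveness of 2/3.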
Results • Comparison with Snort and Bro • N bar = average number of local hosts scanned before decision is made
Contributions • Extremely fast port scan detection algorithm • High accuracy • Low false positive rate • Sound statistical foundation • Sound evaluation of the weaknesses of their approach • Good use of appendices • Cure for insomnia
Weaknesses • Buffer of activity • Attacker can spoof multiple IP addresses • How is filled buffer dealt with? • Flush buffer • Attacker can use this to hide scan activity • Maintain larger buffer • Attacker can keep going until system crashes • Distributed port scans undetectable • Botnets are increasing in popularity
Weaknesses • Test assumes independent connection attempts • As suggested in paper, an attacker could exploit knowledge of the system to connect to some systems while doing surveillance on others • No real time testing conducted, only simulation • Reasoning is a little circular • Poor use of language
Improvements • Implement and test in real time • Perform suggested improvements in paper • Differentiate between different services • Differentiate between rejected and unanswered connection attempts • Use a honeypot to see if the three-way handshake is completed (to detect spoofed IPs) • Should have held out some of the data as a test data set