Fast Portscan Detection Using Sequential Hypothesis Testing

Fast Portscan Detection Using Sequential Hypothesis Testing • Authors: Jaeyeon Jung, Vern Paxson, Arthur W. Berger, and Hari Balakrishnan • Publication: IEEE Symposium on Security and Privacy 2004 • Presenter: Ryan Cunningham

A quick note • All images and equations taken directly from the publication

Port scanning • Network reconnaissance technique • Usually a prelude to an attack • Difficult to detect • Traffic difficult to distinguish from regular traffic • Stealth scans can occur very slowly • Some scans are legitimate • Search engine spiders • SSH, peer-to-peer applications, etc.

Previous detection techniques • Limit distinct connection attempts from one IP • Network Security Monitor • Snort • Also detects malformed packets • Limit failed connection attempts from one IP • Bro • Sensitive to service on specific port • Robertson et al. showed threshold very important

Previous detection techniques • Probabilistic model • Developed by Leckie et al. • Assesses typical traffic a machine receives • Also assesses the traffic a remote machine is likely to send • Combines these probabilities • If the combined result crosses a threshold, an alert is raised • Generates too many false positives

Previous detection techniques • SPICE • Similar to probabilistic model • Used to detect low traffic “stealth” scans • Too computationally intensive for real world

Data set • Traffic from two sites • LBL • 6,000 hosts • Sparse address space 4.4% • ICSI • 200 hosts • Dense address space 42%

Data set • Anonymized TCP logs from Bro • Recorded for one 24-hour period • Bro NIDS flags used for comparison and validation

Data set • Analysis of unsuccessful connection attempts

Data set • Analysis of the ratio of successful to unsuccessful connection attempts

Observations • Scans usually come from one host • Scans make lots of failed connection attempts and few successful connection attempts • Scans should ideally be detected quickly • False positive rate should be configurable

Sequential Hypothesis Testing • Proposed by Wald in the 1940s • Method of doing repeated hypothesis testing as sequential data is gathered • Deciding between two hypotheses • Each time a data point arrives, choose one of three actions • Accept H0 (in our case, benign traffic) • Accept H1 (in our case, port scan traffic) • Wait for more data (next connection attempt)

Sequential Hypothesis Testing • We specify parameters $\alpha$ and $\beta$ • $\alpha \ge$ false positive rate (upper bound) • $\beta \le$ detection probability (lower bound) • We must estimate parameters $\theta_0$ and $\theta_1$ • $\theta_0$: probability a benign connection attempt is successful • $\theta_1$: probability a scanner connection attempt is successful

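Sequential Hypothesis Testing • For each test, we compute the likelihood ratio: $\Lambda(Y) = \prod_{i=1}^{n} \frac{\Pr[Y_i \mid H_1]}{\Pr[Y_i \mid H_0]}$ • Where $Y_i = 0$ if the $i$-th connection attempt succeeds and $Y_i = 1$ if it fails, so that $\Pr[Y_i = 0 \mid H_j] = \theta_j$ and $\Pr[Y_i = 1 \mid H_j] = 1 - \theta_j$ for $j \in \{0, 1\}$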

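Sequential Hypothesis Testing • Compare the likelihood ratio $\Lambda(Y)$ to two thresholds: $\eta_0 = \frac{1 - \beta}{1 - \alpha}$ and $\eta_1 = \frac{\beta}{\alpha}$ • If $\Lambda(Y) \le \eta_0$, accept $H_0$: this is benign traffic • If $\Lambda(Y) \ge \eta_1$, accept $H_1$: this is scan traffic • Otherwise, wait for another connection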
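The decision rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `sprt_decide`, the default parameter values, and the choice to work with the logarithm of the likelihood ratio (to avoid multiplying many small numbers) are all assumptions made for this example. Here `theta0` and `theta1` are the probabilities that a connection attempt succeeds for a benign host and for a scanner, and `alpha` and `beta` are the false positive bound and the detection bound.

```python
import math
from typing import Iterable, Tuple


def sprt_decide(
    outcomes: Iterable[bool],
    theta0: float = 0.8,   # assumed P(success | benign), illustrative value
    theta1: float = 0.2,   # assumed P(success | scanner), illustrative value
    alpha: float = 0.01,   # assumed upper bound on false positive rate
    beta: float = 0.99,    # assumed lower bound on detection probability
) -> Tuple[str, int]:
    """Run the sequential test over connection outcomes (True = success).

    Returns the verdict ("scanner", "benign", or "pending") and the number
    of outcomes consumed before the verdict was reached.
    """
    # Thresholds, expressed in log space.
    log_upper = math.log(beta / alpha)               # accept H1 at or above this
    log_lower = math.log((1 - beta) / (1 - alpha))   # accept H0 at or below this

    # Per-observation updates to the log-likelihood ratio.
    step_success = math.log(theta1 / theta0)
    step_failure = math.log((1 - theta1) / (1 - theta0))

    log_ratio = 0.0
    seen = 0
    for success in outcomes:
        seen += 1
        log_ratio += step_success if success else step_failure
        if log_ratio >= log_upper:
            return "scanner", seen
        if log_ratio <= log_lower:
            return "benign", seen
    return "pending", seen


if __name__ == "__main__":
    print(sprt_decide([False] * 10))   # a host whose attempts all fail
    print(sprt_decide([True] * 10))    # a host whose attempts all succeed
    print(sprt_decide([True, False]))  # mixed evidence, no decision yet
```

With these illustrative values, four consecutive failures are enough to cross the upper threshold and four consecutive successes are enough to cross the lower one. A success followed by a failure cancels out and leaves the test pending.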

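Sequential Hypothesis Testing • We can estimate the expected number of connections $N$ required to decide with: $E[N \mid H_0] = \frac{\alpha \ln\frac{\beta}{\alpha} + (1 - \alpha) \ln\frac{1 - \beta}{1 - \alpha}}{\theta_0 \ln\frac{\theta_1}{\theta_0} + (1 - \theta_0) \ln\frac{1 - \theta_1}{1 - \theta_0}}$ and $E[N \mid H_1] = \frac{\beta \ln\frac{\beta}{\alpha} + (1 - \beta) \ln\frac{1 - \beta}{1 - \alpha}}{\theta_1 \ln\frac{\theta_1}{\theta_0} + (1 - \theta_1) \ln\frac{1 - \theta_1}{1 - \theta_0}}$ • Derivation is long and messy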
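To get a feel for how quickly the test decides, the expected number of connections can be evaluated numerically. The sketch below uses Wald's standard approximation: the expected log-likelihood ratio at the stopping point divided by the expected change per observation. The function name `expected_connections` and the parameter values in the usage lines are assumptions chosen for illustration.

```python
import math
from typing import Tuple


def expected_connections(
    theta0: float, theta1: float, alpha: float, beta: float
) -> Tuple[float, float]:
    """Approximate (E[N | H0], E[N | H1]) using Wald's approximation."""
    log_upper = math.log(beta / alpha)
    log_lower = math.log((1 - beta) / (1 - alpha))
    step_success = math.log(theta1 / theta0)
    step_failure = math.log((1 - theta1) / (1 - theta0))

    # Under H0 the test wrongly stops at the upper threshold with
    # probability alpha; under H1 it correctly stops there with
    # probability beta.
    n_h0 = (alpha * log_upper + (1 - alpha) * log_lower) / (
        theta0 * step_success + (1 - theta0) * step_failure
    )
    n_h1 = (beta * log_upper + (1 - beta) * log_lower) / (
        theta1 * step_success + (1 - theta1) * step_failure
    )
    return n_h0, n_h1


if __name__ == "__main__":
    # Illustrative parameter values, not taken from these slides.
    n_h0, n_h1 = expected_connections(theta0=0.8, theta1=0.2, alpha=0.01, beta=0.99)
    print(f"E[N|H0] ~ {n_h0:.2f}, E[N|H1] ~ {n_h1:.2f}")
```

For these symmetric illustrative values both expectations come out at roughly 5.4 connections. Pushing `theta0` and `theta1` closer together makes each observation less informative and increases the expected count.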

Sequential Hypothesis Testing

Algorithm

Results • Efficiency = true positive / total reported positive • Effectiveness = true positive / total actually positive
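The two metrics on this slide correspond to what is elsewhere called precision and recall. A small sketch of the arithmetic follows. The function names and the counts in the usage lines are hypothetical and are not results from the paper.

```python
def efficiency(true_pos: int, false_pos: int) -> float:
    """True positives divided by all hosts the detector reported."""
    reported = true_pos + false_pos
    return true_pos / reported if reported else 0.0


def effectiveness(true_pos: int, false_neg: int) -> float:
    """True positives divided by all hosts that actually were scanners."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    tp, fp, fn = 90, 10, 30
    print(f"efficiency    = {efficiency(tp, fp):.2f}")
    print(f"effectiveness = {effectiveness(tp, fn):.2f}")
```

A detector can score high on one metric and low on the other, which is why the slide reports both.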

Results • Comparison with Snort and Bro • N bar = average number of local hosts scanned before decision is made

Contributions • Extremely fast port scan detection algorithm • High accuracy • Low false positive rate • Sound statistical foundation • Soundly evaluate the weaknesses of their approach • Good use of appendixes • Cure for insomnia

Weaknesses • Buffer of activity • Attacker can spoof multiple IP addresses • How is filled buffer dealt with? • Flush buffer • Attacker can use this to hide scan activity • Maintain larger buffer • Attacker can keep going until system crashes • Distributed port scans undetectable • Botnets are increasing in popularity

Weaknesses • Test assumes independent connection attempts • As suggested in paper, an attacker could exploit knowledge of the system to connect to some systems while doing surveillance on others • No real time testing conducted, only simulation • Reasoning is a little circular • Poor use of language

Improvements • Implement and test in real time • Perform suggested improvements in paper • Differentiate between different services • Differentiate between rejected and unanswered connection attempts • Use a honeypot to check whether the three-way handshake is completed (to detect spoofed IPs) • Should have held out some of the data as a test data set
