Summary from the Last Lecture

Summary from the Last Lecture • Looked at some research approaches to: • Evaluate defense effectiveness • Stop worm from spreading from a given host • Defend a circle of friends against worms • Detect the worm early • Slow-down and impair worm propagation

DOMINO • The goal is to build an overlay network in which nodes cooperatively detect intrusion activity • Cooperation reduces the number of false positives • The overlay can be used for worm detection • The main feature is active-sink nodes that detect traffic to unused IP addresses • The reaction is to build blacklists of infected nodes • V. Yegneswaran, P. Barford, S. Jha, “Global Intrusion Detection in the DOMINO Overlay System,” NDSS 2004

DOMINO Architecture • Axis nodes collect, aggregate and share data • They are nodes in large, trustworthy ISPs • Each axis node maintains a NIDS and an active sink over a large portion of unused IP space • Access points grant access to axis nodes after thorough administrative checks • Satellite nodes form trees below an axis node, collect information, deliver it to axis nodes and pull relevant information • Terrestrial nodes supply daily summaries of port scan data

Information Sharing • Every axis node maintains a global and local view of intrusion activity • Periodically a node receives summaries from peers which are used to update global view • List of worst offenders grouped per port • Lists of top scanned ports • RSA is used to authenticate nodes and signed SHA digests are used to ensure message integrity and authenticity
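A minimal sketch of how an axis node might fold peer summaries into its global view. The summary layout (port mapped to per-source scan counts) and the names `merge_summaries` and `top_n` are assumptions made for illustration; the slides do not specify the DOMINO message format, and authentication and signing are left out here.

```python
from collections import Counter, defaultdict

def merge_summaries(summaries, top_n=3):
    """Merge per-node summaries into a global view.

    Each summary is a dict: port -> {source_ip: scan_count} (assumed layout).
    Returns (top_scanned_ports, worst_offenders_per_port).
    """
    per_port = defaultdict(Counter)
    for summary in summaries:
        for port, sources in summary.items():
            per_port[port].update(sources)  # adds scan counts per source

    # Rank ports by the total number of scans seen across all peers.
    port_totals = Counter({port: sum(c.values()) for port, c in per_port.items()})
    top_ports = [port for port, _ in port_totals.most_common(top_n)]

    # For each top port, list the sources responsible for the most scans.
    worst = {
        port: [ip for ip, _ in per_port[port].most_common(top_n)]
        for port in top_ports
    }
    return top_ports, worst
```

A node would call this periodically with the summaries received from its peers and keep the result next to its local view.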

How Many Nodes Do We Need? • 40 for port summaries • 20 for the worst-offender list

How Frequent Should Information Exchange Be? • Staleness doesn’t matter much, but more frequent list exchange is better for catching worst offenders

How Long Should Blacklists Be? • About 1000 IPs are enough

How Close Should Monitoring Nodes Be? • Blacklists in the same /16 space are similar, so satellites in the same /16 space should be grouped under the same axis node, and sets of /16 spaces should be randomly distributed among different axis nodes

SQL Snake Experiments • A slow worm that propagated in May 2002 • Nodes exchange reports hourly • An alarm is raised if 20% or more of the nodes vote for an alarm • A node votes if all of these hold: • 200% increase in the number of scans over the hourly average • 100% increase in the number of sources over the hourly average • Number of sources > 5
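The voting rule above can be sketched as follows. The function names are hypothetical, and the sketch assumes that a "200% increase" means at least three times the hourly average and a "100% increase" means at least twice the average; the slides do not spell out the exact comparison.

```python
def node_votes(scans, sources, avg_scans, avg_sources):
    """Return True if this node votes for an alarm in the current hour."""
    return (
        scans >= 3 * avg_scans          # 200% increase over the hourly average (assumed: 3x)
        and sources >= 2 * avg_sources  # 100% increase over the hourly average (assumed: 2x)
        and sources > 5                 # more than 5 distinct sources
    )

def alarm_raised(votes, quorum_percent=20):
    """Raise an alarm if at least quorum_percent of the nodes voted True."""
    if not votes:
        return False
    # Integer arithmetic avoids floating-point surprises at the exact boundary.
    return sum(votes) * 100 >= quorum_percent * len(votes)
```

For example, with five participating nodes a single vote is exactly 20% and is enough to raise the alarm.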

SQL Snake Reaction Time • Almost zero

SQL Slammer Experiments • An extremely fast worm, so periodic information exchange will not be enough • We need spontaneous alerts issued through triggers • A trigger is issued if the following holds: • Number of sources > 5, and • Rule 1: Number of scans is 10 times the average, or • Rule 2: Number of sources is 10 times the average, or • Rule 3: The duration of the anomalous event (horizontal, vertical or coordinated scan) is 10 times the average • Detection is called if more than 10% (Rule 1), 20% (Rule 2) or 30% (Rule 3) of the nodes issue alerts
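A sketch of the trigger logic above, under stated assumptions: "10 times the average" is read as "at least 10 times", each node reports the set of rules that fired, and the names `fired_rules` and `detection_called` are hypothetical.

```python
# Percentage of nodes that must alert, per rule, before detection is called.
THRESHOLD_PERCENT = {1: 10, 2: 20, 3: 30}

def fired_rules(scans, sources, duration,
                avg_scans, avg_sources, avg_duration, factor=10):
    """Return the set of rule numbers that fire on one node."""
    if sources <= 5:                 # every rule also requires more than 5 sources
        return set()
    rules = set()
    if scans >= factor * avg_scans:
        rules.add(1)
    if sources >= factor * avg_sources:
        rules.add(2)
    if duration >= factor * avg_duration:
        rules.add(3)
    return rules

def detection_called(alerts_per_node):
    """alerts_per_node: one set of fired rule numbers per participating node."""
    n = len(alerts_per_node)
    for rule, percent in THRESHOLD_PERCENT.items():
        count = sum(1 for fired in alerts_per_node if rule in fired)
        if n and count * 100 > percent * n:   # strictly more than the threshold
            return True
    return False
```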

SQL Slammer Reaction Time • About 80-100 class C subnets are enough

Automatic Worm Signatures • Focus on TCP worms that propagate via scanning • Idea: a vulnerability exploit is not easily mutable, so worm packets should have some common signature • Step 1: Select suspicious TCP flows using heuristics • Step 2: Generate signatures using content prevalence analysis • H.-A. Kim and B. Karp, “Autograph: Toward Automated, Distributed Worm Signature Detection,” Proceedings of the 13th USENIX Security Symposium (Security 2004), San Diego, CA, August 2004

Suspicious Flows • Detect scanners as hosts that make many unsuccessful connection attempts (>2) • Select their successful flows as suspicious • Build a suspicious flow pool • When there are enough flows in the pool, trigger the signature generation step
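The selection heuristic above might look like this in code. The flow representation (a list of `(src_ip, succeeded, payload)` tuples) and the function name are assumptions for illustration, and the sketch treats every successful flow from a flagged scanner as suspicious regardless of ordering, which is a simplification.

```python
from collections import Counter

def build_suspicious_pool(flows, fail_threshold=2):
    """Collect payloads of successful flows that originate from scanners.

    flows: list of (src_ip, succeeded, payload) tuples (assumed layout).
    A source is treated as a scanner once it has made more than
    fail_threshold unsuccessful connection attempts.
    """
    failures = Counter(src for src, ok, _ in flows if not ok)
    scanners = {src for src, count in failures.items() if count > fail_threshold}
    return [payload for src, ok, payload in flows if ok and src in scanners]
```

Signature generation would start once the returned pool reaches a configured size.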

Signature Generation • Use the most frequent byte sequences across flows as the signature • Naïve techniques fail at byte insertion, deletion and reordering • Content-based payload partitioning (COPP) • Partition the payload into content blocks wherever the Rabin fingerprint of a sliding window matches the breakmark • Configurable parameters: window size, breakmark • Analyze which content blocks appear most frequently and what is the smallest set of those that covers most/all samples in the suspicious flow pool
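A rough sketch of content-based partitioning followed by greedy block selection. It swaps in a simple polynomial hash for the Rabin fingerprint and recomputes it per window rather than rolling it; the parameter names (`window`, `modulus`, `breakmark`, `coverage`) and the greedy cover are assumptions for illustration, not the published implementation.

```python
from collections import Counter

def window_hash(window, base=257, prime=1_000_003):
    """Polynomial hash of a byte window (stand-in for a Rabin fingerprint)."""
    h = 0
    for byte in window:
        h = (h * base + byte) % prime
    return h

def copp_blocks(payload, window=4, modulus=16, breakmark=7):
    """Cut the payload after every window whose hash matches the breakmark."""
    blocks, start = [], 0
    for end in range(window, len(payload) + 1):
        if window_hash(payload[end - window:end]) % modulus == breakmark:
            blocks.append(payload[start:end])
            start = end
    if start < len(payload):
        blocks.append(payload[start:])   # trailing block without a breakmark
    return blocks

def select_signatures(flows, coverage=0.9, **copp_args):
    """Greedily pick the most frequent blocks until `coverage` of flows match."""
    remaining = [set(copp_blocks(f, **copp_args)) for f in flows]
    total = len(remaining)
    signatures = []
    while remaining and (total - len(remaining)) < coverage * total:
        counts = Counter(block for blocks in remaining for block in blocks)
        if not counts:
            break
        best, _ = counts.most_common(1)[0]
        signatures.append(best)
        # Flows already covered by a chosen block no longer influence the counts.
        remaining = [blocks for blocks in remaining if best not in blocks]
    return signatures
```

Because cut points depend only on the bytes inside the window, inserting or deleting bytes shifts block boundaries locally instead of changing every block that follows.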

How Well Does It Work? • Tested on traces of HTTP traffic interlaced with known worms • For large block sizes and large coverage of the suspicious flow pool (90-95%), Autograph performs very well • Low false positive and false negative rates

Distributed Autograph • Would detect more scanners • Would produce more data for suspicious flow pool • Reduce false positives and false negatives

Automatic Signatures (approach 2) • Detect content prevalence • Some content may vary but some portion of the worm remains invariant • Detect address dispersion • The same content will be sent from many hosts to many destinations • Challenge: how to detect these efficiently (low cost = fast operation) • S. Singh, C. Estan, G. Varghese and S. Savage, “Automated Worm Fingerprinting,” OSDI 2004

Content Prevalence Detection • Hash content + port + proto and use this as key to a table where counters are kept • Content hash is calculated over overlapping blocks of fixed size • Use Rabin fingerprint as hash function • Autograph calculates Rabin fingerprint over variable-length blocks that are non-overlapping
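A small sketch of the prevalence table described above. It uses `zlib.crc32` as a stand-in for the Rabin fingerprint, and the block size, the key layout and the `prevalent_keys` threshold are assumptions made for illustration.

```python
from collections import Counter
import zlib

def content_key(block, dst_port, proto):
    """Hash content + port + protocol into one table key (CRC32 stand-in)."""
    return zlib.crc32(block + dst_port.to_bytes(2, "big") + bytes([proto]))

def update_prevalence(table, payload, dst_port, proto, block_size=8):
    """Count every overlapping fixed-size block of the payload."""
    for i in range(len(payload) - block_size + 1):
        block = payload[i:i + block_size]
        table[content_key(block, dst_port, proto)] += 1

def prevalent_keys(table, threshold):
    """Keys whose content has been seen at least `threshold` times."""
    return {key for key, count in table.items() if count >= threshold}
```

Overlapping blocks mean an invariant substring is counted no matter where it starts in the packet, at the price of one table update per byte offset.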

Address Dispersion Detection • Remembering sources and destinations for each content would require too much memory • Scaled bitmap: • Sample down input space, e.g., hash into values 0-63 but only remember those values that hash into 0-31 • Set the bit for the output value (out of 32 bits) • Increase sampling-down factor each time bitmap is full = constant space, flexible counting
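A heavily simplified sketch of the scaled-bitmap idea, assuming a 32-bit bitmap and a SHA-256 based hash. When the bitmap fills, this version folds the current estimate into a running `base`, clears the bitmap and doubles the sampling factor; that policy, the class name and the crude `estimate` formula are assumptions, and the published design differs in detail. Addresses seen both before and after a reset can be counted twice.

```python
import hashlib

class ScaledBitmap:
    """Approximate distinct counter that uses a fixed number of bits."""

    def __init__(self, bits=32):
        self.bits = bits
        self.scale = 1     # sampling-down factor
        self.bitmap = 0
        self.base = 0      # estimate carried over from earlier, full bitmaps

    def _hash(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        # Hash into a space `scale` times larger than the bitmap and
        # remember only the values that land in the first `bits` slots.
        value = self._hash(item) % (self.bits * self.scale)
        if value >= self.bits:
            return  # sampled out
        self.bitmap |= 1 << value
        if self.bitmap == (1 << self.bits) - 1:   # bitmap is full
            self.base += self.bits * self.scale
            self.bitmap = 0
            self.scale *= 2

    def estimate(self):
        """Crude estimate: each set bit stands for about `scale` items."""
        return self.base + bin(self.bitmap).count("1") * self.scale
```

One such counter per content key for sources and one for destinations would keep memory constant however many addresses are seen.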

How Well Does This Work? • Implemented and deployed on the UCSD network

How Well Does This Work? • Some false positives • Spam, common HTTP protocol headers, etc. (easily whitelisted) • Popular BitTorrent files (not easily whitelisted) • No false negatives • Detected each worm outbreak reported in the news • Cross-checked with Snort’s signature detection

Polymorphic Worm Signatures • Insight: multiple invariant substrings must be present in all variants of the worm for the exploit to work • Protocol framing (forces the vulnerable code down the path where the vulnerability exists) • Return address • A single substring is not enough because it is too short • Signature: multiple disjoint byte strings • Conjunction of byte strings • Token subsequences (must appear in order) • Bayes-scored substrings (score + threshold) • J. Newsome, B. Karp and D. Song, “Polygraph: Automatically Generating Signatures for Polymorphic Worms,” IEEE Security and Privacy Symposium, 2005
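The three signature classes listed above differ mainly in how a payload is matched against the tokens. A minimal matching sketch, with hypothetical function names and an assumed additive scoring scheme for the Bayes variant:

```python
def matches_conjunction(payload, tokens):
    """All tokens must appear somewhere in the payload, in any order."""
    return all(token in payload for token in tokens)

def matches_subsequence(payload, tokens):
    """All tokens must appear in the given order, without overlapping."""
    position = 0
    for token in tokens:
        index = payload.find(token, position)
        if index < 0:
            return False
        position = index + len(token)
    return True

def matches_bayes(payload, scored_tokens, threshold):
    """Sum the scores of the tokens present and compare with the threshold."""
    score = sum(s for token, s in scored_tokens.items() if token in payload)
    return score > threshold
```

A token-subsequence signature is stricter than a conjunction over the same tokens, while a Bayes signature can still match when some tokens are missing.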

Worm Code Structure • Invariant bytes: any change makes the worm fail • Wildcard bytes: any change has no effect • Code bytes: Can be changed using some polymorphic technique and worm will still work • E.g., encryption

Polygraph Architecture • All traffic is seen, some is identified as part of suspicious flows and sent to suspicious traffic pool • May contain some good traffic • May contain multiple worms • Rest of traffic is sent to good traffic pool • Algorithm makes a single pass over pools and generates signatures

Signature Detection • Extract tokens (variable length) that occur in at least K samples • The conjunction signature is this set of tokens • To find token-subsequence signatures, samples in the pool are aligned in different ways (shifted left or right) so that the maximum-length subsequences are identified • Contiguous tokens are preferred • For Bayes signatures, for each token a probability is computed that it is contained in a good or a suspicious flow; this is used as a score • Set a high threshold value to avoid false positives
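A naive sketch of token extraction and Bayes scoring as described above. It enumerates all substrings, which is only practical for tiny samples; the minimum token length, the rule for keeping only maximal tokens, the Laplace smoothing and the log-ratio score are all assumptions for illustration rather than the paper's algorithm.

```python
from collections import Counter
import math

def frequent_tokens(samples, k, min_len=2):
    """Return maximal substrings that occur in at least k samples."""
    counts = Counter()
    for sample in samples:
        substrings = {
            sample[i:j]
            for i in range(len(sample))
            for j in range(i + min_len, len(sample) + 1)
        }
        counts.update(substrings)   # each sample counts a substring once
    frequent = {t for t, c in counts.items() if c >= k}
    # Drop tokens that are contained in a longer frequent token.
    return {t for t in frequent if not any(t != u and t in u for u in frequent)}

def bayes_scores(tokens, suspicious, good):
    """Score each token by how much more likely it is in suspicious flows."""
    scores = {}
    for token in tokens:
        # Laplace smoothing keeps both probabilities strictly between 0 and 1.
        p_suspicious = (sum(token in s for s in suspicious) + 1) / (len(suspicious) + 2)
        p_good = (sum(token in g for g in good) + 1) / (len(good) + 2)
        scores[token] = math.log(p_suspicious / p_good)
    return scores
```

The set returned by `frequent_tokens` is itself a conjunction signature; ordering the tokens by their position in the samples would give a token-subsequence candidate.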

How Well Does This Work? • Legitimate traffic traces: HTTP and DNS • Good traffic pool • Some of this traffic mixed with worm traffic to model imperfect separation • Worm traffic: Ideally-polymorphic worms generated from 3 known exploits • Various tests conducted

How Well Does This Work? • When compared with single signature (longest substring) detection, all proposed signatures result in lower false positive rates • False negative rate is always zero if the suspicious pool has at least three samples • If some good traffic ends up in suspicious pool • False negative rate is still low • False positive rate is low until noise gets too big • If there are multiple worms in suspicious pool and noise • False positives and false negatives are still low
