Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience

Hamsa: Fast Signature Generation for Zero-day Polymorphic Wormswith Provable Attack Resilience Zhichun Li, Manan Sanghi, Yan Chen, Ming-Yang Kao and Brian Chavez Lab for Internet & Security Technology (LIST)Northwestern University

The Spread of Sapphire/Slammer Worms

Desired Requirements for Polymorphic Worm Signature Generation • Network-based signature generation • Worms spread in exponential speed, to detect them in their early stage is very crucial… However • At their early stage there are limited worm samples. • The high speed network router may see more worm samples… But • Need to keep up with the network speed ! • Only can use network level information

Desired Requirements for Polymorphic Worm Signature Generation • Noise tolerant • Most network flow classifiers suffer false positives. • Even host based approaches can be injected with noise. • Attack resilience • Attackers always try to evade the detection systems • Efficient signature matching for high-speed links No existing work satisfies these requirements !

Outline • Motivation • Hamsa Design • Model-based Signature Generation • Evaluation • Related Work • Conclusion

Choice of Signatures • Two classes of signatures • Content based • Token: a substring with reasonable coverage to the suspicious traffic • Signatures: conjunction of tokens • Behavior based • Our choice: content based • Fast signature matching. ASIC based approach can archive 6 ~ 8Gb/s • Generic, independent of any protocol or server

Invariants Unique Invariants of Worms • Protocol Frame • The code path to the vulnerability part, usually infrequently used • Code-Red II: ‘.ida?’ or ‘.idq?’ • Control Data: leading to control flow hijacking • Hard coded value to overwrite a jump target or a function call • Worm Executable Payload • CLET polymorphic engine: ‘0\x8b’, ‘\xff\xff\xff’ and ‘t\x07\xeb’ • Possible to have worms with no such invariants, but very hard

Hamsa Architecture

Hamsa Design • Key idea: model the uniqueness of worm invariants • Greedy algorithm for finding token conjunction signatures • Highly accurate while much faster • Both analytically and experimentally • Compared with the latest work, polygraph • Suffix array based token extraction • Provable attack resilience guarantee • Noise tolerant

Hamsa Signature Generator • Core part: Model-based Greedy Signature Generation • Iterative approach for multiple worms

Maximize the coverage in the suspicious pool Suspicious pool Normal pool False positive in the normal pool is bounded by r Problem Formulation Signature Generator Signature false positive bound r With noise NP-Hard!

t1 t2 Joint FP with t1 FP 21% 2% 9% 0.5% 17% 1% 5% Model Uniqueness of Invariants U(1)=upper bound of FP(t1) U(2)=upper bound of FP(t1,t2) The total number of tokens bounded by k*

(COV, FP) (82%, 50%) (70%, 11%) (67%, 30%) (62%, 15%) (50%, 25%) (41%, 55%) (36%, 41%) (12%, 9%) Signature Generation Algorithm token extraction t1 u(1)=15% tokens Suspicious pool Order by coverage

(COV, FP) (COV, FP) (82%, 50%) (69%, 9.8%) (68%, 8.5%) (70%, 11%) (67%, 1%) (67%, 30%) (40%, 2.5%) (62%, 15%) (35%, 12%) (50%, 25%) (41%, 55%) (31%, 9%) (36%, 41%) (10%, 0.5%) (12%, 9%) Signature Generation Algorithm Signature t1 t2 u(2)=7.5% Order by joint coverage with t1

Algorithm Analysis • Runtime analysis O(T*(|M|+|N|)) • Provable Attack Resilience Guarantee • Analytically bound the worst attackers can do! • Example: K*=5, u(1)=0.2, u(2)=0.08, u(3)=0.04, u(4)=0.02, u(5)=0.01 and r=0.01 • The better the flow classifier, the lower are the false negatives

Attack Resilience Assumptions • Two Common assumptions for any sig generation sys • Two Unique assumptions for token-based schemes • Attacks to the flow classifier • Our approach does not depend on perfect flow classifiers • With 99% noise, no approach can work! • High noise injection makes the worm propagate less efficiently. • Enhance flow classifiers

Improvements to the Basic Approach • Generalizing Signature Generation • use scoring function to evaluate the goodness of signature • Iteratively use single worm detector to detect multiple worms • At the first iteration, the algorithm find the signature for the most popular worms in the suspicious pool. • All other worms and normal traffic treat as noise.

Experiment Methodology • Experiential setup: • Suspicious pool: • Three pseudo polymorphic worms based on real exploits (Code-Red II, Apache-Knacker and ATPhttpd), • Two polymorphic engines from Internet (CLET and TAPiON). • Normal pool: 2 hour departmental http trace (326MB) • Signature evaluation: • False negative: 5000 generated worm samples per worm • False positive: • 4-day departmental http trace (12.6 GB) • 3.7GB web crawling including .mp3, .rm, .ppt, .pdf, .swf etc. • /usr/bin of Linux Fedora Core 4

Results on Signature Quality • Single worm with noise • Suspicious pool size: 100 and 200 samples • Noise ratio: 0%, 10%, 30%, 50%, 70% • Noise samples randomly picked from the normal pool • Always get above signatures and accuracy. • Multiple worms with noises give similar results

Speed Results • Implementation with C++/Python • 500 samples with 20% noise, 100MB normal traffic pool, 15 seconds on an XEON 2.8Ghz, 112MB memory consumption • Speed comparison with Polygraph • Asymptotic runtime: O(T) vs. O(|M|2), when |M| increase, T won’t increase as fast as |M|! • Experimental: 64 to 361 times faster (polygraph vs. ours, both in python)

Related works

Conclusion • Network based signature generation and matching are important and challenging • Hamsa: automated signature generation • Fast • Noise tolerant • Provable attack resilience • Capable of detecting multiple worms in a single application protocol • Proposed a model to describe the worm invariants

Questions ?

Experiment: Sample requirement • Coincidental-pattern attack [Polygraph] • Results • For the three pseudo worms, 10 samples can get good results. • CLET and TAPiON at least need 50 samples • Conclusion • For better signatures, to be conservative, at least need 100+ samplesRequire scalable and fast signature generation!

Experiment: U-bound evaluation • To be conservative we chose k*=15. • Even we assume every token has 70% false positive, their conjunction still only have 0.5% false positive. In practice, very few tokens exceed 70% false positive. • Define u(1) and ur, generate • We tested:u(1) = [0.02, 0.04, 0.06, 0.08, 0.10, 0.20, 0.30, 0.40, 0.5] and ur = [0.20, 0.40, 0.60, 0.8]. The minimum (u(1), ur) works for all our worms was (0.08,0.20) • In practice, we use conservative value (0.15,0.5)

Results on Signature Quality (II) • Suspicious pool with high noise ratio: • For noise ratio 50% and 70%, sometimes we can produce two signatures, one is the true worm signature, anther solely from noise. • The false positive of these noise signatures have to be very small: • Mean: 0.09% • Maximum: 0.7% • Multiple worms with noises give similar results

Attack Resilience Assumptions • Common assumptions for any sig generation sys • The attacker cannot control which worm samples are encountered by Hamsa • The attacker cannot control which worm samples encountered will be classified as worm samples by the flow classifier • Unique assumptions for token-based schemes • The attacker cannot change the frequency of tokens in normal traffic • The attacker cannot control which normal samples encountered are classified as worm samples by the worm flow classifier

Normal Traffic Poisoning Attack • We found our approach is not sensitive to the normal traffic pool used • History: last 6 months time window • The attacker has to poison the normal traffic 6 month ahead! • 6 month the vulnerability may have been patched! • Poisoning the popular protocol is very difficult.

Red Herring Attack • Hard to implement • Dynamic updating problem. Again our approach is fast • Partial Signature matching, in extended version.

Coincidental Attack • As mentioned in the Polygraph paper, increase the sample requirement • Again, our approach are scalable and fast

Model Uniqueness of Invariants • Let worm has a set of invariants:Determine their order by: t1: the token with minimum false positive in normal traffic. u(1) is the upper bound of the false positive of t1 t2: the token with minimum joint false positive with t1 FP({t1,t2}) bounded by u(2) ti: the token with minimum joint false positive with {t1, t2, ti-1}. FP({t1,t2,…,ti}) bounded by u(i) The total number of tokens bounded by k*

Problem Formulation Noisy Token Multiset Signature Generation Problem :INPUT: Suspicious pool M and normal traffic pool N; value r<1.OUTPUT: A multi-set of tokens signature S={(t1, n1), . . . (tk, nk)} such that the signature can maximize the coverage in the suspicious pool and the false positive in normal pool should less than r • Without noise, exist polynomial time algo • With noise, NP-Hard

Token-fit Attack Can Fail Polygraph • Polygraph: hierarchical clustering to find signatures w/ smallest false positives • With the token distribution of the noise in the suspicious pool, the attacker can make the worm samples more like noise traffic • Different worm samples encode different noise tokens • Our approach can still work!

Noise samples Worm samples N1 W1 N2 W2 N3 W3 Merge Candidate 3 Merge Candidate 2 Merge Candidate 1 Token-fit attack could make Polygraph fail CANNOT merge further!NO true signature found!

Generalizing Signature Generation with noise • BEST Signature = Balanced Signature • Balance the sensitivity with the specificity • But how? Create notation Scoring function:score(cov, fp, …) to evaluate the goodness of signature • Current used • Intuition: it is better to reduce the coverage 1/a if the false positive becomes 10 times smaller. • Add some weight to the length of signature (LEN) to break ties between the signatures with same coverage and false positive

Generalizing Signature Generation with noise • Algorithm: similar • Running time: same as previous simple form • Attack Resilience Guarantee: similar

Extension to multiple worm • Iteratively use single worm detector to detect multiple worm • At the first iteration, the algorithm find the signature for the most popular worms in the suspicious pool. All other worms and normal traffic treat as noise. • Though the analysis for the single worm can apply to multiple worms, but the bound are not very promising. Reason: high noise ratio

Implementation details • Token Extraction: extracta set of tokens with minimum length l and minimum coverage COVmin. • Polygraph use suffix tree based approach: 20n space and time consuming. • Our approach: Enhanced suffix array 8n space and much faster! (at least 20 times) • Calculate false positive when check U-bounds • Again suffix array based approach, but for a 300MB normal pool, 1.2GB suffix array still large! • Optimization: using MMAP, memory usage: 150 ~ 250MB

Token Extraction • Extracta set of tokens with minimum length lmin and coverage COVmin. And for each token output the frequency vector. • Polygraph use suffix tree based approach: 20n space and time consuming. • Our approach: • Enhanced suffix array 4n space • Much faster, at least 50(UPDATE) times! • Can apply to Polygraph also.

Calculate the false positive • We need to have the false positive to check the U-bounds • Again suffix array based approach, but for a 300MB normal pool, 1.2GB suffix array still large! • Improvements • Caching • MMAP suffix array. True memory usage: 150 ~ 250MB. • 2 level normal pool • Hardware based fast string matching • Compress normal pool and string matching algorithms directly over compressed strings

Future works • Enhance the flow classifiers • Cluster suspicious flows by return messages • Malicious flow verification by replaying to Address Space Randomization enabled servers.

Experiment: Attacks • We propose a new attack: token-fit. • The attacker may study the noise inside the suspicious pool • Create worm sample Wi which may has more same tokens with some normal traffic noise sample Ni • This will stuck the hierarchical clustering used in [Polygraph] • BUT We still can generate correct signature!

Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience