Learning to Detect and Classify Malicious Executables in the Wild

Reporter: 林佳宜
Email: M98570015@mail.ntou.edu.tw
Date: 2014/10/3

References
• J. Zico Kolter and Marcus A. Maloof. Learning to Detect and Classify Malicious Executables in the Wild. JMLR, 2006.

Outline
• Introduction
• Classification Methodology
• Experimental Design
• Experimental Results
• Conclusion

Introduction
• Malicious code can cause harm or subvert a system's intended function
• Malicious executables fall into three categories
  • viruses, worms, and Trojan horses
• The paper describes the use of machine learning and data mining to
  • detect and classify malicious executables

Three main contributions
• Detect and classify malicious executables
  • using methods from text classification
• Present empirical results
  • from an extensive study of inductive methods for detecting and classifying malicious executables
• Show that the methods achieve high detection rates
  • even on completely new, previously unseen malicious executables

Several learning methods
• Implemented in the Waikato Environment for Knowledge Analysis (WEKA)
  • IBk (instance-based, k-nearest neighbors)
  • naive Bayes
  • support vector machine (SVM)
  • J48 (decision tree)
• Used the AdaBoost.M1 algorithm to
  • boost SVMs, J48, and naive Bayes
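To make the boosting step concrete, here is a minimal sketch of AdaBoost.M1 for two classes. It is not WEKA's implementation: the weak learner is a one-feature decision stump over Boolean attributes, chosen only to keep the example short, and all function names are hypothetical.

```python
import math

def train_stump(X, y, weights):
    """Return (weighted_error, feature_index, polarity) of the best stump.

    A stump predicts class 1 when feature j equals polarity, else class 0.
    """
    best = None
    for j in range(len(X[0])):
        for polarity in (True, False):
            err = sum(w for x, label, w in zip(X, y, weights)
                      if (1 if x[j] == polarity else 0) != label)
            if best is None or err < best[0]:
                best = (err, j, polarity)
    return best

def stump_predict(stump, x):
    _, j, polarity = stump
    return 1 if x[j] == polarity else 0

def adaboost_m1(X, y, rounds=10):
    """Train a weighted ensemble of stumps; returns [(stump, vote_weight)]."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, weights)
        err = stump[0]
        if err >= 0.5:
            break  # weak learner is no better than chance: stop
        if err == 0:
            return [(stump, 1.0)]  # a single stump already fits the data
        beta = err / (1.0 - err)
        # Shrink the weights of correctly classified examples, then renormalize
        weights = [w * beta if stump_predict(stump, x) == label else w
                   for x, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        ensemble.append((stump, math.log(1.0 / beta)))
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the stumps; ties and empty ensembles go to class 0."""
    votes = {0: 0.0, 1: 0.0}
    for stump, alpha in ensemble:
        votes[stump_predict(stump, x)] += alpha
    return 1 if votes[1] > votes[0] else 0
```

On a toy problem where the label is the majority of three Boolean features, no single stump is perfect, but three boosting rounds combine one stump per feature into a correct majority vote.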

Data Collection
• Gathered this collection in early 2003
• Benign executables: 1,971
  • from Windows 2000 and XP operating systems
  • SourceForge
  • download.com
• Malicious executables: 1,651
  • from the Web site VX Heavens
  • from the MITRE Corporation, the sponsors of this project
• More recently, obtained 291 new malicious executables
  • from VX Heavens

Experimental Design
• To evaluate the approach and methods, used stratified ten-fold cross-validation
  • randomly partitioned the executables into ten disjoint sets of equal size
  • one set served as the testing set
  • the other nine formed the training set
• Extracted n-grams from the executables in the training and testing sets
• Selected the most relevant features from the training data
• Conducted ROC analysis for each method
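The partitioning step can be sketched as follows. This is an illustrative version of stratified k-fold splitting, not the authors' code; the function names and the round-robin dealing strategy are assumptions.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Split example indices into k disjoint folds with similar class balance."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds

def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), using each fold once for testing."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Feature selection would then run inside each iteration on the training indices only, matching the slide's "from the training data".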

Detecting Malicious Executables
• Evaluated how well the learning methods detect malicious executables in three experimental studies
• The first was a pilot study to determine
  • the size of words and n-grams
  • the number of n-grams relevant for prediction
• The second applied all of the classification methods to
  • a small collection of executables
• The third applied the methodology to
  • a larger collection of executables

Pilot Studies [1/2]
• Pilot studies to determine three parameters
  • the size of n-grams
  • the size of words
  • the number of selected features
• Extracted bytes from
  • 476 malicious executables and 561 benign executables
  • and produced n-grams, for n = 4
• Selected the best 10, 20, ..., 100, 200, ..., 1,000, 2,000, ..., 10,000 n-grams
• Selecting 500 n-grams produced the best results
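As a rough illustration of n-gram extraction, the sketch below slides a window of n bytes over an executable's contents and records each distinct sequence as a hex string. The overlapping window and the hex representation are assumptions, since the slide only says that n-grams were produced for n = 4; `byte_ngrams` is a hypothetical name.

```python
def byte_ngrams(data, n=4):
    """Return the set of distinct overlapping n-byte sequences as hex strings."""
    return {data[i:i + n].hex() for i in range(len(data) - n + 1)}

# Example: six bytes yield three overlapping 4-grams
sample = bytes.fromhex("ff00ab3e12b3")
grams = byte_ngrams(sample, n=4)
```

Inputs shorter than n bytes yield an empty set, so very small files simply contribute no features.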

Pilot Studies [2/2]
• Fixed the number of n-grams at 500
  • and varied n, the size of the n-grams
• Evaluated the same methods for n = 1, 2, ..., 10
  • n = 4 produced the best results
• Varied the size of the words (one byte, two bytes, etc.)
  • single bytes produced better results

Classification Methodology
• Formed training examples
  • from the n-grams extracted from the executables
  • by viewing each n-gram as a Boolean attribute (present or absent)
• Selected the most relevant attributes by
  • computing the information gain (IG) of each n-gram
• Selected the top 500 n-grams
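The selection step can be sketched as below. The slide does not show the formula, so this assumes the usual mutual-information form, IG(j) = Σ_v Σ_C P(v, C) log [P(v, C) / (P(v) P(C))], summed over the attribute values v (present, absent) and the classes C. The function names and the base-2 logarithm are illustrative choices.

```python
import math
from collections import Counter

def information_gain(present, labels):
    """IG between one Boolean attribute and the class label, in bits.

    present: list of bools (n-gram present in each executable)
    labels:  list of class labels, same length
    """
    n = len(labels)
    joint = Counter(zip(present, labels))
    value_counts = Counter(present)
    class_counts = Counter(labels)
    gain = 0.0
    for (value, cls), count in joint.items():
        p_joint = count / n
        p_value = value_counts[value] / n
        p_class = class_counts[cls] / n
        gain += p_joint * math.log2(p_joint / (p_value * p_class))
    return gain

def select_top_ngrams(examples, labels, k=500):
    """examples: list of n-gram sets, one per executable. Returns top-k n-grams."""
    vocabulary = set().union(*examples)
    scored = []
    for gram in vocabulary:
        present = [gram in example for example in examples]
        scored.append((information_gain(present, labels), gram))
    # Highest gain first; break ties alphabetically for reproducibility
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gram for _, gram in scored[:k]]
```

This loops over the whole vocabulary in memory, which is fine for a toy example but would need a more economical counting scheme at the scale of hundreds of millions of distinct n-grams.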

Experiment with a Small Collection
• The executables produced 68,744,909 distinct n-grams
• Areas under the ROC curves (AUC), with 95% confidence intervals
  • the boosted methods performed well
  • naive Bayes did not perform as well
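For readers unfamiliar with AUC, here is a small sketch that estimates it by pairwise comparison (the Mann-Whitney statistic). The slides do not say how the areas were computed, so this is a stand-in technique, and it assumes each detector outputs a score where higher means more likely malicious.

```python
def auc(scores, labels):
    """Fraction of (malicious, benign) pairs ranked correctly; ties count half.

    scores: detector outputs, higher = more likely malicious
    labels: 1 for malicious, 0 for benign
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 1.0 means every malicious executable scored above every benign one, while 0.5 corresponds to random ranking.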

Experiment with a Larger Collection
• This collection consisted of
  • 1,971 benign executables
  • 1,651 malicious executables
  • over 255 million distinct n-grams of size four
• Areas under the ROC curves, with 95% confidence intervals
  • boosted J48 outperformed all other methods

Classifying Executables by Payload Function
• Classify malicious executables based on
  • the function of their payload
• Present results for three functional categories
  • opened a backdoor, mass-mailed, or was an executable virus
• Goal: reduce the time needed to analyze previously undiscovered malicious executables

Evaluating Real-world, Online Performance
• Compared the actual detection rates
  • detectors built from the larger collection vs. the 291 new malicious executables
• Selected three desired false-positive rates
  • 0.01, 0.05, 0.1
• Boosted J48 detected about 98% of the new malicious executables
  • at a desired false-positive rate of 0.05
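One way to picture this evaluation: pick a score threshold that keeps false positives on known benign files at or below the desired rate, then measure how many new malicious files exceed it. The sketch below is an assumption about the procedure, not the authors' code, and both function names are hypothetical.

```python
import math

def threshold_for_fp_rate(benign_scores, desired_fp_rate):
    """Return a threshold t such that at most desired_fp_rate of benign
    scores are strictly greater than t (files with score > t are flagged)."""
    ranked = sorted(benign_scores, reverse=True)
    allowed = int(math.floor(desired_fp_rate * len(ranked)))
    if allowed >= len(ranked):
        return float("-inf")  # every benign file may be flagged
    return ranked[allowed]

def detection_rate(malicious_scores, threshold):
    """Fraction of malicious files whose score exceeds the threshold."""
    flagged = sum(1 for s in malicious_scores if s > threshold)
    return flagged / len(malicious_scores)

# Example with made-up scores (not data from the study)
benign = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
threshold = threshold_for_fp_rate(benign, 0.25)
rate = detection_rate([0.9, 0.65, 0.5, 0.95], threshold)
```

Because the threshold is fixed before the new files are seen, the detection rate measured on them can differ from the rate the ROC curve predicted, which is the comparison this slide describes.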

Conclusion
• Unknown malicious executables can be detected and classified using machine learning, data mining, and text classification
• Detecting malicious executables
  • boosted J48 produced the best detector, with an area under the ROC curve of 0.996
• Classifying malicious executables by payload function
  • boosted J48 produced the best detectors, with areas under the ROC curve around 0.9

Questions
