# Graph Techniques for Malware Detection

##### Presentation Transcript

1. Graph Techniques for Malware Detection • Mark Stamp Graph Techniques

2. Pre-Intro • A lot of malware-related research uses graph techniques • Here, we consider 3 research papers • All use graphs for malware detection • And they are very different approaches • There are many good project topics • So pay attention…

3. Intro • Many graphs defined on software • We consider only a few examples here • Can use such graphs to compare code • I.e., scores can be defined for graphs • Graph can serve as a code signature • Might even identify metamorphic family • A lot of work done in this area • But still plenty of good open problems

4. Code Graphs • We consider the following types of software-based graphs • Control flow graphs • Function call graphs • Opcode graphs • But first, discuss graphs in general • Then graph techniques for malware • Then consider 3 papers in some detail

5. Graphs • Graph consists of a set of vertices (or nodes) and a set of edges • An edge connects a pair of vertices • Edges can be directed or undirected • Directed edges go in one direction • Undirected edges go in both directions • Edges sometimes include weights • Weights are sometimes probabilities

6. Graphs • Graph specified as G = (V,E) • Where V is set of vertices • And E is set of edges, i.e., pairs of vertices • Undirected graph, E is unordered pairs • Directed graph, E is ordered pairs • Graph theory has many applications • Many general results depend on matrices derived from graphs

7. Examples of Graphs • Undirected graph • Vertices are circles (may be labeled or not) • Edges are lines • Weighted directed graph • Edges are arrows • Edge labels are “weights” (0.2, 0.3, 0.5, and 1.0 in the figure) • Weights are often probabilities

8. Graphs and Matrices • Adjacency matrix $A = \{a_{ij}\}$ • Where $a_{ij} = 1$ if there is an edge from node i to node j • Otherwise, $a_{ij} = 0$ • For undirected graph, A is symmetric • Incidence matrix $B = \{b_{ij}\}$ • Where $b_{ij} = 1$ if vertex i is incident to edge j • Otherwise $b_{ij} = 0$ • Lots more matrices related to graphs
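
As a minimal sketch of the two matrices on this slide, the following builds both from an edge list. The function names and the triangle example are illustrative assumptions, not taken from the slides; vertices and edges are indexed from 0 here rather than 1.

```python
def adjacency_matrix(n, edges, directed=True):
    """Return the n x n adjacency matrix: A[i][j] = 1 if there is an edge i -> j."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # undirected edges count in both directions
    return A


def incidence_matrix(n, edges):
    """Return the n x m incidence matrix: B[v][e] = 1 if vertex v touches edge e."""
    B = [[0] * len(edges) for _ in range(n)]
    for e, (i, j) in enumerate(edges):
        B[i][e] = 1
        B[j][e] = 1
    return B


# Hypothetical example: an undirected triangle on vertices 0, 1, 2.
edges = [(0, 1), (1, 2), (2, 0)]
A = adjacency_matrix(3, edges, directed=False)
B = incidence_matrix(3, edges)
```

For the undirected triangle, `A` equals its own transpose, which is the symmetry property noted above, and every column of `B` contains exactly two ones because each edge touches two vertices.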

9. Graphs and Matrices • Since we have matrix representations… • We can apply linear algebra to graphs • Example 1 • Let A be adjacency matrix of graph G • Note that A is a square matrix • Consider the nth power of A, that is, $A^n$ • Element (i,j) of $A^n$ is the number of walks of length n from vertex i to vertex j in G (a walk is a path that may repeat vertices and edges)
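
The counting property in Example 1 can be checked on a small case. This is a sketch only: the helper names and the 3-vertex directed graph are assumptions chosen for illustration, and plain lists are used instead of a linear algebra library.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(A, n):
    """Return A raised to the nth power (n >= 0) by repeated multiplication."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    for _ in range(n):
        result = mat_mul(result, A)
    return result


# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
A = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]
A2 = mat_pow(A, 2)
A3 = mat_pow(A, 3)
```

Here `A2[0][2]` is 1 because the only length-2 route from vertex 0 to vertex 2 is 0 → 1 → 2, and `A3[0][0]` is 1 because the only length-3 route from 0 back to 0 is 0 → 1 → 2 → 0.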

10. Graphs and Matrices • Example 2 • Graph G is connected if any vertex can be reached from any other vertex • If G is connected, with n vertices, the rank of its incidence matrix is n – 1 (this holds for the oriented incidence matrix, where each edge column has one +1 and one –1, or for the 0/1 matrix over GF(2)) • Other interesting results involve eigenvalues, eigenvectors, etc.
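
The rank result in Example 2 can be verified numerically. The sketch below assumes the oriented incidence matrix (one entry of -1 and one of +1 per edge column), for which the rank over the rationals is the number of vertices minus the number of connected components. The function names and test graphs are illustrative, and exact fractions are used to avoid floating-point issues.

```python
from fractions import Fraction


def oriented_incidence(n, edges):
    """Return the n x m oriented incidence matrix: -1 at the tail, +1 at the head."""
    B = [[0] * len(edges) for _ in range(n)]
    for e, (i, j) in enumerate(edges):
        B[i][e] = -1
        B[j][e] = 1
    return B


def rank(M):
    """Rank of a matrix via Gaussian elimination over exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    rows = len(M)
    cols = len(M[0]) if M else 0
    r = 0  # number of pivot rows found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            if M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


# Hypothetical graphs on 4 vertices: a connected cycle and a disconnected pair of edges.
connected = [(0, 1), (1, 2), (2, 3), (3, 0)]
disconnected = [(0, 1), (2, 3)]
rank_connected = rank(oriented_incidence(4, connected))
rank_disconnected = rank(oriented_incidence(4, disconnected))
```

The connected cycle gives rank 3 = n – 1, while the graph with two components gives rank 2, so the rank drops by one for each additional component.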

11. How to Compare Graphs? • Graphs are isomorphic if we can relabel vertices of one to obtain the other • Implies “structure” is the same • Exact computational complexity of graph isomorphism is unknown: no polynomial-time algorithm is known, and it is not known to be NP-complete • We’ll need to score graphs G and H • That is, measure similarity of G and H • And the score must be easy to compute • We’ll see examples later…

12. Control Flow Graph • Nodes are “basic blocks” • Only entry at top, only exit at bottom • Control flow useful for optimization • E.g., remove (simple) dead code
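
One simple form of dead code removal is to drop basic blocks that cannot be reached from the entry block. The sketch below assumes a control flow graph stored as a dictionary from block name to successor list; the block names are hypothetical, and real optimizers handle many more cases than plain unreachability.

```python
def reachable_blocks(cfg, entry):
    """Return the set of basic blocks reachable from entry by following edges."""
    seen = set()
    stack = [entry]
    while stack:
        block = stack.pop()
        if block in seen:
            continue
        seen.add(block)
        stack.extend(cfg.get(block, []))
    return seen


# Hypothetical control flow graph: "dead" has no incoming edge from reachable code.
cfg = {
    "entry": ["check"],
    "check": ["then", "exit"],
    "then": ["exit"],
    "dead": ["exit"],
    "exit": [],
}
dead = set(cfg) - reachable_blocks(cfg, "entry")
```

In this example `dead` contains only the block named `"dead"`. Code guarded by a condition that is always false at run time would still be counted as reachable by this check, which is one reason such analysis is easy to evade.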

13. Control Flow Graphs • Consider dead code… • We can make dead code harder to detect via control flow analysis (How?) • Why? To obfuscate malware with respect to control flow analysis • Such dead code is an easy obfuscation • Result? Control flow analysis can be defeated by reasonably advanced malware • Bottom line: Not a cure-all for malware

14. Function Call Graph • Like a higher-level control flow • Not focused on basic blocks • For example, based on function calls • Previously applied to metamorphic malware detection • Good results and seems fairly robust • But overly complex scoring (IMHO) • Project: Break it and/or improve on it • Details coming soon…

15. Opcode Graph • Several possible opcode graphs • We consider case where nodes are opcodes, edges are possible transitions • And edge-weights are probabilities • Note: Graph represents digram statistics • How to compare 2 such graphs? • This is the interesting question • We’ll use a simple, effective method • This research considered next…

16. Opcode Graph Similarity and Metamorphic Detection • Neha Runwal • Richard M. Low • Mark Stamp

17. Intro • A previous paper considered opcode graph analysis for malware detection • Approach was successful • But technique seems overly complex given that graph structure is very simple • Applied to fairly ordinary (polymorphic) malware, not metamorphic families • We want to test score simplifications, and consider metamorphic malware

18. Software Similarity • Metamorphic detection can be based on measuring software similarity • Lots of related previous research • Hidden Markov model (HMM), chi-squared, simple substitution distance, n-gram, etc. • We use HMM approach as benchmark • Recall HMMs and related malware detection results

19. Previous Work • Construct a “Markov graph” • Based on digram (opcode pair) frequencies • Use SVM for classification • Requires selecting kernel function • They combine 2 standard kernels • Claim that this effectively compares both local and global graph structure • Compare results to n-gram analysis

20. Previous Work • Collect data dynamically using “Ether” • I.e., extract opcodes on executed path • So polymorphism is no defense • Construct “Markov chain graph” • Actually, a very, very simple graph • SVM kernel function • Combine Gaussian and spectral kernels • Requires eigenvector computations • Consequently, efficiency is not very good

21. Opcode Graph • Opcode graph based on digrams (opcode pairs) • In effect, probability that opcode A is followed by opcode B • Yields a simple, easy to construct graph • Question: How to compute scores? • That is, how to measure graph similarity? • First, we consider an example that illustrates opcode graph construction

22. Example Code • Assembly code

23. Consecutive Pairs • Counts • PUSH then MOV occurs twice • MOV then PUSH once

24. Edge Weights • Relative frequency • Normalized per row • Why per row?
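
The construction on slides 22 through 24 can be sketched as follows. The opcode sequence is a made-up stand-in, since the slide's assembly listing is not reproduced in this transcript; it is chosen only so that PUSH then MOV occurs twice and MOV then PUSH once. Function names are likewise illustrative.

```python
from collections import Counter, defaultdict


def digram_counts(opcodes):
    """Count each consecutive (current opcode, next opcode) pair."""
    return Counter(zip(opcodes, opcodes[1:]))


def edge_weights(counts):
    """Normalize counts per row so each weight estimates P(next | current)."""
    row_totals = defaultdict(int)
    for (src, _dst), count in counts.items():
        row_totals[src] += count
    return {(src, dst): count / row_totals[src]
            for (src, dst), count in counts.items()}


# Hypothetical opcode sequence (not the slide's example code).
opcodes = ["PUSH", "MOV", "PUSH", "MOV", "ADD"]
counts = digram_counts(opcodes)
weights = edge_weights(counts)
```

Normalizing per row makes the outgoing weights of each opcode sum to 1, so each row reads as a conditional distribution over the next opcode. In this sequence MOV is followed by PUSH once and by ADD once, so both of those edges get weight 0.5, while PUSH is always followed by MOV and that edge gets weight 1.0.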

25. Example as Opcode Graph

26. Previous Work (Again) • Recall that the previous work used SVM to classify opcode graphs • And efficiency was not impressive • Here, we want a faster method • We ignore time required for disassembly • We also consider metamorphic code • Previous work focused on Netbull • Netbull is a backdoor trojan

27. Previous Work • Results from previous paper… • Note low FP, high FN for AV products • What’s up with that?

28. Comparing Opcode Graphs • We use a much simpler approach • Instead of SVM/graph kernel… • Compare opcode graphs directly • Also, we consider metamorphic malware • How to directly compare graphs? • Opcode graph is extremely simple graph • Making a direct comparison possible • SVM is “heavy artillery” for such graphs

29. Opcode Graph Score • Let A and B be opcode graphs • Map opcodes to 1,2,…,N • Let $A = \{a_{ij}\}$ be edge-weight matrix for A • Then $a_{ij}$ is probability next opcode is j, given that current opcode is i • Let $B = \{b_{ij}\}$ be edge-weight matrix of B • Both A and B are N×N matrices • Corresponding vertices easy to match up

30. Opcode Graph Score • Define $\mathrm{score}(A,B) = \frac{1}{N^2}\left(\sum_{i,j} |a_{ij} - b_{ij}|\right)^2$ • Where sum is over i=1,2,…,N and j=1,2,…,N • If A = B, then score(A,B) = 0 • If $a_{ij} = 1$ and $b_{ik} = 1$ for j ≠ k, then $\sum_{j} |a_{ij} - b_{ij}| = 2$, which is the maximum row sum • Implies $\mathrm{score}(A,B) \le (2N)^2/N^2 = 4$ • Hence, 0 ≤ score(A,B) ≤ 4 • The smaller the score, the more similar

31. Opcode Graph Score • Other score metrics considered… • Euclidean distance, for example • None gave better results • Other graph comparisons considered… • See “note” for this slide • Again, none gave better results • Our score is easy and fast to compute • But is it effective?

32. Data Files • Metamorphic malware: 200 NGVCK • Benign: 41 cygwin utility files • These files used in other studies • In particular, HMM analysis, where accuracy was essentially 100% • Would be nice to have harder test data… • Can compare results to previous work • Metamorphic detection research, that is

33. Results • Important cases are “Metamorphic vs Metamorphic” and “Normal vs Normal”

34. Discussion • Might argue that uncommon opcodes are “weighted” too heavily • Since all nodes count the same • E.g., MOV and FLDCW treated the same • So a few uncommon opcodes in malware might make it stand out from benign • Did tests removing uncommon opcodes • Next slide

35. Remove Uncommon Opcodes • Metamorphic vs metamorphic • Before and after removal • Note that scores don’t change much

36. Remove Uncommon Opcodes • After uncommon opcode removal • Metamorphic vs metamorphic… • And normal vs metamorphic • Still obtain good separation

37. Increased Morphing • Dead code inserted from benign files • “Block morphing” used

38. Increased Morphing Rate • At 30% block morphing • Misclassifications occur

39. Comparison to HMM • Using same block morphing, scored files using HMM detector • At 30% block morphing, results comparable to previous slide • We conclude that our opcode graph score is comparable to HMM score • Analysis not detailed enough to say which is actually better • But we can say the difference is slight

40. Random Morphing • Benign vs morphed malware • Scores worse at higher morphing???

41. Conclusion • Simple opcode graph score tested • Good results, comparable to HMM • We showed how to defeat the score • How do results compare to complex opcode graph/SVM score? • Unfortunately, no direct comparison… • Opcode graphs based on opcode pairs • In that sense, similar to HMM…

42. Metamorphic Detection Using Function Call Graph Analysis • Prasad Deshpande • Mark Stamp

43. Intro • Function call graphs previously studied (a lot) for malware detection • Here, applied to metamorphic malware • Scoring technique used here follows previous work closely • First, brief background material • Then explain graph/scoring in detail • Finally, we give results

44. Background • Metamorphic techniques • Register swap • Transposition • Dead code insertion • Instruction substitution • Formal grammar mutation • Host code mutation • Code integration

45. Background • HMM-based detection • Again, HMM detection serves as a benchmark against which we compare • We’ve already seen the details • So, we’ll assume it’s known…

46. Function Call Graph • Disassemble the program • Local functions look like sub_xxxxxx • External functions too, e.g., GetVersion • Each function is a node in our graph • Directed edges represent caller/callee relationships • Edges point to functions that are called • Edges found using breadth-first search
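
The breadth-first edge discovery on this slide might look like the sketch below. It assumes the disassembler output has already been reduced to a dictionary from each function name to the functions it calls; that dictionary, the entry name `start`, and the `sub_...` addresses are all hypothetical.

```python
from collections import deque


def call_graph_edges(calls, entry):
    """Breadth-first search from entry, returning directed (caller, callee) edges."""
    edges = []
    seen = {entry}
    queue = deque([entry])
    while queue:
        caller = queue.popleft()
        for callee in calls.get(caller, []):
            edges.append((caller, callee))  # edge points to the called function
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return edges


# Hypothetical call lists: local functions are sub_*, GetVersion is external.
calls = {
    "start": ["sub_401000", "GetVersion"],
    "sub_401000": ["sub_401050", "GetVersion"],
    "sub_401050": ["start"],
}
edges = call_graph_edges(calls, "start")
```

Each function is expanded once, so the recursive call from `sub_401050` back to `start` adds an edge without causing an infinite loop. External functions such as `GetVersion` appear as nodes with no outgoing edges because their bodies are not in the disassembly.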

47. Function Call Graph • Example of part of function call graph

48. Call Graph Similarity • How to compare function call graphs? • Local functions • Names will not match • But graph structure can be compared • External functions • Names should match • But not much graph structure available • How to combine local and external?

49. External Functions • Given function call graphs G1 and G2 • Extract external functions from each • Compare 2 sets of function names • All matching names are saved for scoring • Matched names become vertices in graph • We use resulting vertices for scoring • Scoring details later…
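
The name-matching step for external functions amounts to a set intersection. The sketch below assumes, following slide 46, that local functions are exactly those named `sub_...` and everything else is external; that naming test and the example node lists are assumptions, and the actual scoring is not shown here.

```python
def is_external(name):
    """Treat any function not named sub_* as external (an assumed convention)."""
    return not name.startswith("sub_")


def matched_externals(g1_nodes, g2_nodes):
    """Return the external function names that appear in both call graphs."""
    ext1 = {name for name in g1_nodes if is_external(name)}
    ext2 = {name for name in g2_nodes if is_external(name)}
    return ext1 & ext2


# Hypothetical node lists for two call graphs G1 and G2.
g1_nodes = ["sub_401000", "GetVersion", "ExitProcess", "sub_401050"]
g2_nodes = ["sub_402000", "GetVersion", "CreateFileA", "ExitProcess"]
matched = matched_externals(g1_nodes, g2_nodes)
```

Here `matched` contains `GetVersion` and `ExitProcess`. Local function names are ignored at this stage because, as noted on slide 48, they are not expected to match between variants.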

50. Local Functions • Methods to compare local functions • Based on external functions called • Based on opcode sequence similarity • Based on “matched neighbors” • This approach follows previous work • Each of 3 measures is reasonable • But overall, seems very ad hoc • Little confidence this is (near) optimal