Malware Analysis and Instrumentation

Malware Analysis and Instrumentation. Andrew Bernat and Kevin Roundy, Paradyn Project. Center for Computing Science, June 14, 2011. Forensic analysts need help: 90% of malware resists analysis [1], and malware attacks cost billions of dollars annually [2].

Presentation Transcript


  1. Malware Analysis and Instrumentation • Andrew Bernat and Kevin Roundy • Paradyn Project • Center for Computing Science • June 14, 2011

  2. Forensic analysts need help • 90% of malware resists analysis [1] • Malware attacks cost billions of dollars annually [2] • 65% of users feel the effects of cybercrime [3] • 69% of cybercrimes are resolved [3] • 28 days on average to resolve a cybercrime [3] • [Figure: hex dump of a malware binary] • References: [1] McAfee, 2008. [2] Computer Economics, 2007. [3] Norton, 2010.

  3. Forensic analysts need help • The needed toolbox: • Binary code identification • Control- and data-flow analysis • Instrumentation • Effectiveness on malware • [Figure: hex dump of a malware binary]

  4. Dyninst is a toolbox for analysts • Capabilities: library injection; function replacement; loop, block, function, and instruction instrumentation; symbol table reading and writing; forward and backward slices; machine language parsing; CFG; loop analysis; call stack walking; binary rewriting; process control • Components: control flow analyzer, data flow analyzer, instrumenter • [Figure: Dyninst operating on a program binary]

  5. Dyninst is a toolbox for analysts • Analysis tool (mutator): • Specifies instrumentation • Gets callbacks for runtime events • Builds high-level analysis • Dyninst capabilities: library injection; function replacement; loop, block, function, and instruction instrumentation; symbol table reading and writing; forward and backward slices; machine language parsing; CFG; loop analysis; call stack walking; binary rewriting; process control • Components: control flow analyzer, data flow analyzer, instrumenter • [Figure: mutator driving Dyninst on a program binary]

  6. Dyninst is a toolbox for analysts • Code snippets: printf(…), getTarget(insn), counter++, if (pred) callback(…) • Analysis tool (mutator): • Specifies instrumentation • Gets callbacks for runtime events • Builds high-level analysis • Example analyses: code visualizations; analysis of network communications; time bomb detection and analysis; identification of stolen data; reports on anti-analysis techniques • Dyninst components: control flow analyzer, data flow analyzer, instrumenter • [Figure: mutator driving Dyninst on a program binary]

  7. Dyninst on malware • Malware defeats static analysis and is sensitive to instrumentation • Code snippets: printf(…), getTarget(insn), counter++, if (pred) callback(…) • Analysis tool (mutator): • Specifies instrumentation • Gets callbacks for runtime events • Builds high-level analysis • Example analyses: code visualizations; analysis of network communications; time bomb detection and analysis; identification of stolen data; reports on anti-analysis techniques • Dyninst components: control flow analyzer, data flow analyzer, instrumenter • [Figure: mutator driving Dyninst on a malware binary]

  8. Dyninst on malware • Malware defeats static analysis and is sensitive to instrumentation • SR-Dyninst adds: static-dynamic analysis; sensitivity-resistant instrumenter • Code snippets: printf(…), getTarget(insn), counter++, if (pred) callback(…) • Analysis tool (mutator): • Specifies instrumentation • Gets callbacks for runtime events • Builds high-level analysis • Example analyses: code visualizations; analysis of network communications; time bomb detection and analysis; identification of stolen data; reports on anti-analysis techniques • Components: control flow analyzer, data flow analyzer, instrumenter • [Figure: mutator driving SR-Dyninst on a malware binary]

  9. Outline • Anti-analysis tricks (Anti) • Hybrid static-dynamic analysis (H.A.) • Sensitivity resistance (S.R.) • Results (Res.)

  10. Anti-analysis tricks Anti • Anti-analysis: • Obfuscated control flow: indirect control flow, stack tampering, overlapping code, signal-based control flow • Unpacked code: all-at-once, block-, loop-, or function-at-a-time, to empty or allocated space • Overwritten code: single operand or opcode, whole instruction, function, code section, buffer • Anti-instrumentation: • PC-sensitive code: call-pop pairs, return-address manipulation, call-stack tampering and probing • Anti-patching: checksum whole regions, probe for patches, use code as data, move stack pointer • Address-space probing: scans and probes of locations that should be unallocated

  11. Obfuscated control flow Anti • [Figure labels: storm worm; Entry Point; 40d002]

  12. Unpacked code Anti • [Figure labels: storm worm; Entry Point; hex dump of the unpacked code region]

  13. Overwritten code Anti • [Figure labels: Upack packer; Entry Point; hex dump of the overwritten code region]

  14. PC-sensitive code Anti • e.g., ASProtect • Local data access: • call: use call to get current PC • data • pop esi: pop PC into register • add esi, eax; mov ebx, ptr[esi]: construct pointer and dereference

  15. Anti-patching Anti • Checksumming detects instrumentation [Aucsmith 96] • e.g., PECompact • Checksum routine over protected code: • xor eax, eax • add eax, ptr[ebx]; add ebx, 4; cmp ebx, 0x41000; jne .loop (calculate checksum of protected region) • cmp eax, .chksum; jne .fail; jmp (compare to expected value) • pass / fail: on fail, instrumentation is detected
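The checksum loop on this slide can be modelled in a few lines. The sketch below is illustrative only: the 16-byte region, the patch bytes, and the names `checksum` and `self_check` are assumptions, not taken from PECompact or Dyninst. It sums little-endian 32-bit words with wraparound, as the `add eax, ptr[ebx]` / `add ebx, 4` loop does, and shows that a 5-byte jump patch changes the result.

```python
import struct

def checksum(region: bytes) -> int:
    """Sum the region as little-endian 32-bit words, wrapping at 2**32."""
    total = 0
    for (word,) in struct.iter_unpack("<I", region):
        total = (total + word) & 0xFFFFFFFF
    return total

# Hypothetical protected region and the value the program expects.
original = bytes(range(16))
expected = checksum(original)

# Hypothetical patch: an instrumenter overwrites 5 bytes with a jmp rel32.
patched = b"\xe9\x10\x20\x30\x40" + original[5:]

def self_check(region: bytes) -> str:
    """Mimic the compare-to-expected-value step of the routine."""
    return "pass" if checksum(region) == expected else "fail"

print(self_check(original), self_check(patched))
```

Any in-place patch inside the summed range is visible to this kind of check unless its reads are redirected to an unmodified copy.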

  16. Address-space probing Anti • Memory scan: int *ptr = 0; segv_handler() { ptr += PAGESIZE; goto RESTART; } sigaction(SIGSEGV, segv_handler); while (1) { RESTART: *ptr; ptr += PAGESIZE; } • [Figure: address space with code, data, code, and instrumentation regions]
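A rough model of the memory scan above, assuming a toy address space described by a set of allocated page numbers. A raised `MemoryError` stands in for SIGSEGV, and the page numbers are hypothetical. The point is that pages allocated by an instrumenter show up as readable where the program expects a fault.

```python
PAGE_SIZE = 0x1000

def read(addr, allocated_pages):
    """Toy memory read: raises MemoryError (a stand-in for SIGSEGV) on unallocated pages."""
    if addr // PAGE_SIZE not in allocated_pages:
        raise MemoryError(hex(addr))
    return 0

def scan(allocated_pages, num_pages):
    """Probe one address per page and return the set of readable pages."""
    readable = set()
    for page in range(num_pages):
        try:
            read(page * PAGE_SIZE, allocated_pages)
            readable.add(page)
        except MemoryError:
            continue  # the handler simply moves on to the next page
    return readable

baseline = scan({1, 2, 5}, 8)          # layout the malware expects
instrumented = scan({1, 2, 5, 6}, 8)   # page 6: hypothetical instrumentation area
unexpected = instrumented - baseline   # pages that should have faulted
print(sorted(unexpected))
```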

  17. Code discovery algorithm H.A. • Hybrid algorithm: • Parse from known entry points • Instrument control flow that may lead to new code • Resume execution • [Figure: CFG with unresolved (?) targets; labels: instrument, overwrite, exception; CALL ptr[eax]; DIV eax, 0]

  21. Code discovery algorithm H.A. • Hybrid algorithm: • Parse from known entry points • Instrument control flow that may lead to new code • Resume execution • [Figure: CFG with one remaining unresolved (?) target; labels: instrument, overwrite, exception; CALL ptr[eax]; DIV eax, 0]
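The parse / instrument / resume cycle can be sketched on a toy program. Everything here is an assumption made for illustration: the `PROGRAM` encoding, the addresses, and the `RUNTIME_TARGETS` table that plays the role of actually executing the code. It is not Dyninst's API.

```python
# Toy program: address -> (kind, static target). All values are hypothetical.
PROGRAM = {
    0x10: ("jmp", 0x20),
    0x20: ("call_indirect", None),  # target only known at run time
    0x30: ("jmp", 0x40),
    0x40: ("ret", None),
}
RUNTIME_TARGETS = {0x20: 0x30}  # what each indirect transfer resolves to when executed

def parse(entry, known):
    """Follow direct control flow from entry; return sites that need instrumentation."""
    unresolved, work = set(), [entry]
    while work:
        addr = work.pop()
        if addr in known:
            continue
        known.add(addr)
        kind, target = PROGRAM[addr]
        if kind == "jmp":
            work.append(target)
        elif kind == "call_indirect":
            unresolved.add(addr)
    return unresolved

def discover(entry):
    """Alternate static parsing with (simulated) execution of instrumented sites."""
    known = set()
    instrumented = parse(entry, known)
    rounds = 0
    while instrumented:
        site = instrumented.pop()        # execution reaches an instrumented site
        target = RUNTIME_TARGETS[site]   # the callback reports the real target
        if target not in known:
            instrumented |= parse(target, known)
        rounds += 1
    return known, rounds

known, rounds = discover(0x10)
print(sorted(hex(a) for a in known), rounds)
```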

  22. Accurate parsing H.A. • Standard control-flow traversal • start from known entry points • follow control flow to find code • New conservative assumption • unresolved calls may not return • So, we don’t parse garbage code • New stack tamper detection • backwards slice at ret instruction • So, we detect modified return addresses • [Figure: call ptr[eax] followed by garbage; pop ebp; inc ebp; push ebp; ret]
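A small sketch of why the "unresolved calls may not return" assumption matters, using a made-up four-instruction listing (the `CODE` table and the `traverse` function are hypothetical). With the optimistic assumption the traversal falls through into the junk bytes after the indirect call; with the conservative one it does not.

```python
# Toy listing: index -> (mnemonic, static target). Hypothetical encoding.
CODE = [
    ("call", 3),              # 0: direct call to a function that returns
    ("call_indirect", None),  # 1: call ptr[eax], target unknown statically
    ("garbage", None),        # 2: junk bytes placed after the call
    ("ret", None),            # 3: callee of the direct call
]

def traverse(entry, unresolved_calls_return):
    """Control-flow traversal; returns the set of indices treated as code."""
    seen, work = set(), [entry]
    while work:
        i = work.pop()
        if i in seen:
            continue
        seen.add(i)
        op, target = CODE[i]
        if op == "call":
            work += [target, i + 1]   # resolved call: parse callee and fall-through
        elif op == "call_indirect" and unresolved_calls_return:
            work.append(i + 1)        # optimistic: assume the call comes back
    return seen

naive = traverse(0, True)
conservative = traverse(0, False)
print(sorted(naive), sorted(conservative))
```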

  23. Instrumentation-based discovery H.A. • Invalid control transfers: call 401000 into an invalid region • Indirect control transfers: jmp eax; call ptr[eax]; push eax ... ret • Exception-based control transfers: xor eax, eax; mov ebx, ptr[eax], which transfers to the exception handler

  24. Instrumentation-based discovery H.A. • [Figure: Dyninst and the process; the target of call ptr[eax] is unknown (?)]

  25. Instrumentation-based discovery H.A. • Original code: … call ptr[eax] … • Patched code: jmp 823456 • Instrumentation: save state; call findTarget(ptr[eax]); restore state; call ptr[eax] • findTarget(targ) { if ( !cacheLookup(targ) ) RPC_updateAnalysis(targ); }
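The `findTarget` snippet on this slide amounts to a cache check in front of a more expensive analysis update. A minimal model follows; the class name `Analysis` and the `updates` log are hypothetical stand-ins for the mutator side, and no real RPC is performed.

```python
class Analysis:
    """Toy stand-in for the mutator's view of already-parsed targets."""

    def __init__(self, known_targets):
        self.cache = set(known_targets)  # targets that have already been parsed
        self.updates = []                # log of simulated analysis updates

    def update_analysis(self, target):
        """Stand-in for the slide's RPC_updateAnalysis: parse at the new target."""
        self.updates.append(target)
        self.cache.add(target)

    def find_target(self, target):
        """Runs at an instrumented indirect transfer, before the original instruction."""
        if target not in self.cache:
            self.update_analysis(target)

analysis = Analysis({0x401000})
for observed in (0x401000, 0x823456, 0x823456):
    analysis.find_target(observed)
print([hex(t) for t in analysis.updates])
```

Only the first sighting of a new target pays for an analysis update; later executions of the same transfer hit the cache.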

  26. Overwritten code discovery H.A. • [Figure: Dyninst observing a write to code pages with permissions RWX, RWX, RWX]

  27. Overwritten code discovery H.A. • When to update • Challenges • large incremental overwrites • writes to data • writes to own page • [Figure: Dyninst code write handler and CFG update routine; write to pages with permissions R E, R E, R E]

  28. Overwritten code discovery H.A. • When to update • Challenges • large incremental overwrites • writes to data • writes to own page • Approach • Delay the update until write routine terminates • [Figure: Dyninst code write handler and CFG update routine; write to pages with permissions R E, R E, R E]

  29. Overwritten code discovery H.A. • Update after overwrite • Handle overwrite signal • instrument write loop exits • copy overwritten page • restore write permissions • resume execution • Update CFG when writes end • remove overwritten and unreachable blocks • parse at entry points to overwritten regions • remove write permissions • resume execution • [Figure: Dyninst code write handler and CFG update routine with callbacks (cb); page permissions R-X, RWX, R-X, R-X]

  30. Overwritten code discovery H.A. • Update after overwrite • Handle overwrite signal • instrument write loop exits • copy overwritten page • restore write permissions • resume execution • Update CFG when writes end • remove overwritten and unreachable blocks • parse at entry points to overwritten regions • remove write permissions • resume execution • [Figure: Dyninst code write handler and CFG update routine with callbacks (cb); page permissions R-X, R-X, RWX, R-X]
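The two-phase handling listed above (fault on first write, CFG update when the write loop exits) can be modelled as a small state machine over one page. This is a sketch under assumptions: the `Page` class, its method names, and the `events` log are invented for illustration, and "re-parse" is reduced to returning the changed offsets.

```python
class Page:
    """Toy code page that starts without write permission."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.perms = "R-X"
        self.snapshot = None
        self.events = []

    def write(self, offset, value):
        if "W" not in self.perms:             # first write faults: overwrite signal
            self.snapshot = bytes(self.data)  # copy the page before it changes
            self.perms = "RWX"                # restore write permission
            self.events.append("fault")
        self.data[offset] = value             # resume: the write proceeds

    def write_loop_exit(self):
        """Callback at the exit of the writing loop: update once, then re-protect."""
        before = self.snapshot if self.snapshot is not None else bytes(self.data)
        changed = [i for i, (a, b) in enumerate(zip(before, self.data)) if a != b]
        self.perms = "R-X"                    # remove write permission again
        self.snapshot = None
        self.events.append("cfg_update")
        return changed                        # offsets whose code must be re-parsed

page = Page(b"\x90" * 8)
for i in range(4):
    page.write(i, 0xCC)                       # a loop overwriting four bytes
changed = page.write_loop_exit()
print(page.events, changed, page.perms)
```

Delaying the update to the loop exit means four writes cost one fault and one CFG update rather than four of each.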

  31. Behavior Changes S.R. • Program modification affects local behavior • These changes propagate • Malware detects changes (or crashes)

  32. Sensitivity Resistant Approach S.R. • Identify instructions sensitive to modification • Moved instructions that access the program counter • Memory operations that may access patched code • Memory operations that may scan the address space • Project effects on program behavior • Are output (or control flow) affected? • Use a forward slice and symbolic evaluation • Determine how to compensate for modification • E.g. by emulating the original instruction

  33. PC-sensitivity analysis S.R. • Original code: main: call foo ... call next; <data>; next: pop %esi; add %esi, %eax; mov (%esi), %ebx; jmp %ebx; foo: ... ret • Sensitive: call foo • Slice: call foo; ret • Symbolic expansion: pc = $retAddr + $delta • Sensitive: call next • Slice: call next; pop %esi; add %esi, %eax; mov (%esi), %ebx; jmp %ebx • Symbolic expansion: pc = [$next + %eax + $delta] • Relocated code: reloc_main: call foo ... push $next; pop %esi; add %esi, %eax; mov (%esi), %ebx; jmp %ebx
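To make the call/pop case concrete, here is a toy model of what the popped value leads to after relocation. The addresses, `DELTA`, the `MEMORY` table, and `run_after_call` are all assumed for illustration. A relocated `call next` would push an address shifted by the relocation distance, while pushing the original address (the slide's `push $next`) keeps the later dereference intact.

```python
ORIG_RET = 0x401005   # hypothetical address the original 'call next' would push
DELTA = 0x500000      # hypothetical distance the code was moved
MEMORY = {ORIG_RET + 8: 0x402000}  # pointer stored in the data after the original call

def run_after_call(pushed, eax=8):
    """Model of: pop into esi, add an offset, then load through the pointer."""
    stack = [pushed]
    esi = stack.pop()       # pop %esi
    esi += eax              # add the offset held in eax
    return MEMORY.get(esi)  # load via esi; None models reading the wrong place

original = run_after_call(ORIG_RET)             # unmodified program
naive = run_after_call(ORIG_RET + DELTA)        # relocated call pushes a shifted address
compensated = run_after_call(ORIG_RET)          # push of the original address instead
print(original, naive, compensated)
```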

  34. Sensitivity Classes S.R. • PC (program counter) sensitive • Moved instruction that accesses the PC • CF (control flow) sensitive • Instruction whose control flow successor was moved • CAD (code as data) sensitive • Instruction that reads from overwritten memory • AVU (allocated vs. unallocated) sensitive • Instruction that accesses newly allocated memory

  35. Visible Compatibility S.R. • What behavior do we need to preserve? • Allow localized changes that aren’t visible from outside the program • Preserve: • Output • Approximation: control flow

  36. Handling CAD Sensitivity S.R. • Checksum routine (original code): xor eax, eax; add eax, ptr[ebx]; add ebx, 4; cmp ebx, 0x41000; jne .loop; cmp eax, .chksum; jne .fail • Patched code: jmp 863828; add ebx, 4; cmp ebx, 0x41000; jne .loop • Patch (instrumentation): save state; emulate (add eax, ptr[ebx]); restore state • [Figure labels: code, patch data, patch code, instrumentation, shadow memory, pass, fail]

  37. Emulating Memory (Simplified) S.R. • Save state: push %eax; push %ecx; push %edx; lahf; push %eax • Determine effective address: lea <original>, %ebx • Translate effective address: call translate • Restore state: pop %eax; sahf; pop %edx; pop %ecx; pop %eax • Emulate original memory instruction: mov (%ebx), %ebx
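The translate step can be pictured as redirecting reads of patched bytes to a saved copy of the original bytes. The sketch below assumes a 16-byte region at a hypothetical `BASE`, a 5-byte patch, and a `shadow` copy; the function names are invented. A byte sum computed through the emulated read matches the original program's, while a raw read sees the patch.

```python
BASE = 0x401000                          # hypothetical start of the code region
original = bytes(range(16))              # bytes the program was shipped with
live = bytearray(original)
live[0:5] = b"\xe9\x10\x20\x30\x40"      # hypothetical instrumentation patch
shadow = bytes(original)                 # saved copy of the unmodified bytes
patched = range(BASE, BASE + 5)          # addresses the patch overwrote

def raw_read(addr):
    """What an un-emulated memory instruction would see."""
    return live[addr - BASE]

def emulated_read(addr):
    """Translate the effective address: patched bytes come from shadow memory."""
    source = shadow if addr in patched else live
    return source[addr - BASE]

def byte_sum(read):
    """A trivial self-check that reads the whole region through `read`."""
    return sum(read(BASE + i) for i in range(len(original)))

print(byte_sum(raw_read), byte_sum(emulated_read), sum(original))
```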

  38. The Devil in the Details S.R. • IA-32 is a rich instruction set • Most instructions can access memory • And malware uses a wide variety of them • Instruction classes: • Most common: MOD/RM byte • Less common: “string” operations • Least common: absolute address

  39. String Operations S.R. • “String” instructions implicitly use ESI/EDI • scas/lods/stos/movs/cmps/ins/outs • Some update ESI/EDI, making emulation tricky • Malware loves these for copying blocks of memory • Original: movs • Emulation: <save>; mov %edi, %edx; mov %esi, %ecx; call TranslateShift; add %edx, %edi; add %ecx, %esi; movs; sub %edx, %edi; sub %ecx, %esi; <restore>
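One way to read the emulation sequence above: compute a per-pointer shift, apply it, run the string instruction, then subtract the shift so the program-visible registers look untouched. The model below is an interpretation, not Dyninst code; `translate_shift`, the dictionary-based memory, and the addresses are assumptions, and only a single byte move with forward direction is modelled.

```python
def translate_shift(addr, shadow_delta, patched):
    """Distance to add so addr points at its shadow copy (0 if addr is unpatched)."""
    return shadow_delta if addr in patched else 0

def emulated_movsb(regs, mem, shadow_delta, patched):
    """Emulate one byte move from [esi] to [edi] with translated pointers."""
    d_src = translate_shift(regs["esi"], shadow_delta, patched)
    d_dst = translate_shift(regs["edi"], shadow_delta, patched)
    regs["esi"] += d_src                 # apply shifts
    regs["edi"] += d_dst
    mem[regs["edi"]] = mem[regs["esi"]]  # the move itself
    regs["esi"] += 1                     # the instruction's own pointer update
    regs["edi"] += 1
    regs["esi"] -= d_src                 # undo shifts so registers look unmodified
    regs["edi"] -= d_dst

# Hypothetical layout: 0x1000 is patched; its original byte lives at 0x1000 + 0x8000.
mem = {0x1000: 0xE9, 0x9000: 0x55, 0x2000: 0x00}
regs = {"esi": 0x1000, "edi": 0x2000}
emulated_movsb(regs, mem, shadow_delta=0x8000, patched={0x1000})
print(hex(mem[0x2000]), hex(regs["esi"]), hex(regs["edi"]))
```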

  40. Address-space scanning S.R. • Scan routine (original code): xor eax, eax; mov ptr[eax], ebx; add eax, 4; cmp eax, 0; jne .loop • Patched code: jmp 863828; add eax, 4; cmp ebx, 0; jne .loop • Patch (instrumentation): save state; emulate (mov ptr[eax], ebx); restore state; call chk_mem • Handlers: segv_handler, dyn_segv_handler • [Figure labels: code, patch data, patch code, instrumentation, pass, fail]

  41. Exception Handler Interposition S.R. • Emulation sequence: push %eax; push %ecx; push %edx; lahf; push %eax; lea <original>, %eax; call translate; pop %eax; sahf; pop %edx; pop %ecx; pop %eax; mov (%eax), %eax • Exception record (before): Faulting insn: <reloc_addr>; Faulting addr: 0; Registers: • Exception record (after): Faulting insn: <orig_addr>; Faulting addr: <eff_addr>; Registers: • [Figure labels: Windows Libraries, dyn_segv_handler, segv_handler]
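The two exception records on this slide suggest a rewrite step between the fault and the program's own handler. A minimal sketch of that idea follows, with an assumed relocation map and dictionary-shaped records; `RELOC_TO_ORIG`, `interpose`, and the addresses are hypothetical, and register fix-ups are left out.

```python
RELOC_TO_ORIG = {0x900010: 0x401010}  # hypothetical map from relocated to original code

def interpose(record, effective_addr, program_handler):
    """Rewrite the record so the program's handler sees original-program values."""
    fixed = dict(record)
    fixed["insn"] = RELOC_TO_ORIG.get(record["insn"], record["insn"])
    fixed["addr"] = effective_addr    # address the original instruction would have touched
    return program_handler(fixed)

delivered = []
raised = {"insn": 0x900010, "addr": 0}
interpose(raised, 0x7000, delivered.append)
print(delivered[0])
```

The record passed in is copied rather than modified, so the instrumenter's own view of the fault is preserved.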

  42. The packers we’re studying Res. SR-Dyninst Packer Malware market share[1] Obfuscated Self-modifying Anti-instrumentation Dyninst √ UPX 9.45% √ PolyEnE 6.21% yes EXECryptor 4.06% yes yes yes x yes yes yes x Themida 2.95% yes yes yes PECompact 2.59% √ √ Upack 2.08% yes yes nPack 1.74% √ anti-debugging techniques √ Aspack 1.29% yes yes √ FSG 1.26% yes √ yes Nspack 0.89% yes yes Asprotect 0.43% yes yes √ x Armadillo 0.37% yes yes yes Yoda's Protector 0.33% yes yes yes √ √ WinUPack 0.17% yes yes MEW 0.13% √ yes [1] Packer (r)evolution. Panda Research, 2008. Two-month average Feb-March 2008.

  43. Sample malware analysis factory Res. • Input: 200 malware binaries • SD-Dyninst: comprehensive instrumentation; network call instrumentation • Outputs: • Control-flow graph showing executed blocks • Stack trace at 1st network communication • Trace of Win API calls • Defensive tactics report • unpacked code • overwritten code • control flow obfuscations

  44. Factory results for Conficker A Res. • [Figure labels: packed payload; initial bootstrap code]

  45. Factory results for Conficker A Res. unpacked block static block API func non executed block

  46. Factory results for Conficker A Res. • Instrument network calls and perform a stack-walk • Stack-walk of Conficker’s communications thread: • Frame pc=0x100016f7 func: DYNstopThread at 0x100001670 [Dyninst] • Frame pc=0x71ab2dc0 func: select at 0x71ab2dc0 [Win DLL] • Frame pc=0x401f34 func: nosym1f058 at 0x41f058 [Conficker] • (We can also print stackwalks of Conficker’s other threads)

  47. Improved Dyninst overhead Res. • Reduced relocation overhead despite emulation • Better handling of program features • Exceptions • Indirect control flow

  48. Conclusion • SR-Dyninst gives you • All the benefits of Dyninst on malware • Safer instrumentation on normal binaries • Ongoing work • Anti-debugger techniques • More descriptive CFGs • Automated defensive-mode activation • SR-Dyninst in next Dyninst release
