Diversity Algorithms for Worrisome Software and Networks (DAWSON)

James Just, Nathan Li, Mark Cornwell (Global InfoTek, Inc.)
Jeff Rowe, Tufan Demir (UC Davis)
R. Sekar (SUNY Stony Brook)
15 December 2005

DAWSON Overview
• Explores biologically inspired diversity
• Automatically generates a large number of program variants
• Variants differ in memory layout
• Targets memory errors such as buffer overflows
• Implemented on Microsoft Windows

Agenda
• Introduction
• Development Update
• Testing Update
• Analytic Update
• Conclusions

Diversity System Functional Architecture
[Figure labels: Translation Wrapper; User Inputs; Attacker (memory error exploits); Other System Resources; Original Program; Modified PE File, Loader & System Calls; Optional Annotation File; Transformed In-memory Program; PRNG (pseudo-random number generator)]
• Modifications transform the original stored program
• Address randomization does not remove the vulnerability but makes the effect of an attack unpredictable
• Normal user inputs work
• Some attacks fail because the vulnerability is not at the assumed address
• Other attacks fail because the injected commands are wrong

Multi-Layer Defense Strategy
• Layer 1: Prevent remote exploit of memory errors
• Layer 2: Prevent injected code from properly executing
• Layer 3: Prevent bypass of Layer 2

Characterizing DAWSON’s Multi-Layer Defenses
[Figure labels: Kernel memory; Writable memory; Executable memory; STACK; HEAP; USER.EXE; USER.DLL; SYS.DLL; SYS.SYS]
• Layer 1 (Exploit): randomize heap base; randomize base of main and thread stacks; randomize code location; rebase DLLs; randomize heap allocations
• Layer 2 (Payload): IAT permutation; PEB/SEH masking*
• Layer 3: payload non-bypassability*

Development Update

DAWSON Development Phases
• Phase 1 (first 6 months)
  • Diversity approaches
  • Code transformation techniques
• Phase 2 (second 6 months)
  • Windows randomization integration
  • Application protection
• Phase 3 (third 6 months)
  • Host protection
  • Performance and memory efficiency
  • Extensive tests

DAWSON on a Host
[Figure labels: Remote Monitor & Controller (e.g., Blackboard); Messages; Local Host; Randomization Configuration]

DAWSON Changes (Since July '05)
• Primary stack randomization
• Native API augmentation
  • Coming: kernel driver integration
• System DLLs base randomization (rebasing)
  • Kernel-mode driver
• PEB/SEH protection
• Debugging API
• 6 new exploits
• Extensive testing
  • Test in small
  • Test in large
  • Red team exercise

DLL Rebasing Issues
• Rebasing system DLLs such as ntdll and kernel32
  • Solution: use a kernel-mode driver to rebase at boot time
• Cost vs. benefit
  • Significant benefits: can break both exploit and payload
  • Costs:
    • Performance impact of relocating code
    • Memory impact due to reduced sharing across processes
• Options:
  • Baseline: Rebase and Share
    • Shared, but introduces common vulnerabilities across all apps
  • Rebase on First Use
  • Rebase on Request
  • Configurable via registry settings
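The relocation cost listed on this slide comes from patching every absolute address in a module when its load base changes. The following is a minimal sketch under a simplifying assumption: the image is a plain byte buffer and the relocation table is a flat list of offsets of 32-bit absolute addresses. Real PE files store this information in block-structured .reloc entries, and the name apply_rebase is hypothetical, not a DAWSON or Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified rebasing: `fixups` lists the byte offsets of every
 * 32-bit absolute address stored in `image`. Returns 0 on success and -1 if
 * any fixup would fall outside the image. */
int apply_rebase(uint8_t *image, size_t image_size,
                 const size_t *fixups, size_t n_fixups,
                 uint32_t old_base, uint32_t new_base)
{
    uint32_t delta = new_base - old_base;   /* wraps modulo 2^32 by design */

    assert(image != NULL && (fixups != NULL || n_fixups == 0));

    /* Validate first so a bad table never leaves the image half patched. */
    for (size_t i = 0; i < n_fixups; i++) {
        if (image_size < sizeof(uint32_t) ||
            fixups[i] > image_size - sizeof(uint32_t))
            return -1;
    }

    /* Patch each stored address; memcpy keeps unaligned accesses safe. */
    for (size_t i = 0; i < n_fixups; i++) {
        uint32_t value;
        memcpy(&value, image + fixups[i], sizeof value);
        value += delta;
        memcpy(image + fixups[i], &value, sizeof value);
    }
    return 0;
}
```

In this model the work grows with the number of fixups, and every patched page becomes private to the process, which is one way to read the "reduced sharing" cost.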

DAWSON Second Layer
• Payload execution prevention
• IAT permutation (done prior to July PI meeting)
  • The IAT is used to look up addresses of functions in DLLs
  • By permuting the order of IAT entries, attack code will access the wrong function
• PEB/SEH protection (new)
  • The PEB is a data structure through which the addresses of common API functions can be located
  • The PEB is memory protected, so accessing it raises an exception
  • The exception handler checks the location of the caller; if it is outside the program boundary, access is denied
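The IAT permutation idea can be illustrated with a small sketch. It assumes a toy model in which the IAT is an array of addresses and legitimate call sites are rewritten to go through a private index map. The names permute_iat and iat_lookup, the IAT_MAX limit, and the xorshift generator are all illustrative assumptions, not DAWSON's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IAT_MAX 64   /* illustrative upper bound on table size */

/* Tiny xorshift PRNG to keep the sketch self-contained; a real system would
 * need an unpredictable seed and a stronger generator. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Permute `table` in place. On return, slot_of[i] is the slot that now holds
 * the entry originally stored at index i. Returns -1 if n exceeds IAT_MAX. */
int permute_iat(uintptr_t *table, size_t *slot_of, size_t n, uint32_t seed)
{
    uintptr_t shuffled[IAT_MAX];
    uint32_t state = seed ? seed : 1u;   /* xorshift must not start at 0 */

    if (n > IAT_MAX)
        return -1;
    assert(table != NULL && slot_of != NULL);

    for (size_t i = 0; i < n; i++)
        slot_of[i] = i;
    for (size_t i = n; i > 1; i--) {     /* Fisher-Yates shuffle of the map */
        size_t j = xorshift32(&state) % i;   /* slight modulo bias, fine here */
        size_t tmp = slot_of[i - 1];
        slot_of[i - 1] = slot_of[j];
        slot_of[j] = tmp;
    }
    for (size_t i = 0; i < n; i++)       /* move entries to their new slots */
        shuffled[slot_of[i]] = table[i];
    for (size_t i = 0; i < n; i++)
        table[i] = shuffled[i];
    return 0;
}

/* Rewritten call sites resolve an original index through the map. */
uintptr_t iat_lookup(const uintptr_t *table, const size_t *slot_of,
                     size_t original_index)
{
    return table[slot_of[original_index]];
}
```

Code that knows the map still reaches the intended function, while injected code that hard-codes an original slot index may land on a different entry.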

Rebasing Executables using Exception Handler
[Figure labels: Exception Handler with Address Map; stack; IAT; .text; Address Map; numbered steps 1 to 3]

Limitations

Where Absolute Address Randomization Fails
• Non-pointer attacks
  • Overflow a buffer to corrupt nearby non-pointer data, e.g., a string used as an argument of execve
  • Relies on finding security-critical data next to vulnerable buffers, which is not very easy
• Attacks that can extract the “randomization key”
  • “Information leakage attacks”
  • Rely on a vulnerability that sends back pointer values in response to a request
• Vulnerabilities shared by many other defenses
  • StackGuard, StackGhost, PointGuard, and some ISR implementations

Repetitive Attacks
• Double-pointer attack
  • AAR (absolute address randomization) provides only limited protection
• Guessing attacks
  • Require on the order of 15K attempts
• Solutions
  • Layer 2 defenses
  • Automated response:
    • Filtering based on automatically generated signatures is a promising approach to address these
    • [Liang et al ’05] generate signatures that reliably block 10 of 11 attacks in their test suite
    • Less than 10% performance overhead, no false positives

Expected Attack Attempts for Conventional Attacks

Testing Update

Implementation Status
• Kernel driver: system DLL randomization
• Layer 1:
  • 2-level stack randomization, including primary stack
  • 2-level heap randomization
  • Application DLL randomization
  • EXE randomization when .reloc is available (included in synthetic vulnerable server)
• Layer 2:
  • IAT permutation and library name erasure integrated
  • SEH/PEB protection developed, NOT integrated
• Layer 3: not integrated

Testing Changes (Since July '05)
• Extended Benchmark Vulnerable Service to incorporate 15 vulnerabilities
• Extended Attack Corpus to 15 corresponding attacks packaged in Metasploit
• Extensive internal testing
  • Performed testing on Emulab to observe contributions of individual randomizations
  • Automated testing on a small-scale 3-node in-house testbed and used the results to refine and debug the randomization software
  • Built an iterative test that restarts VulnSrv to support testing of brute-force attacks
• Conducted Red Team experiment in November

DAWSON Testing Platform
[Figure labels: four Vulnerable Service instances, each with a Listening Thread; Monitoring; Metasploit Attack Center; Attack String]

Key Test Characteristics
• Vulnerabilities
  • Stack buffer overflow
  • Format string
  • Integer overflow
  • Heap overflow
  • A function may have combinations of these vulnerabilities
• Payloads:
  • Injected code
  • Existing code
  • Existing program

Test Demo
• Randomization blocks most attacks from the test corpus
• All randomizations turned on
• Single kernel randomization
• Processes re-randomize at every process start
• 2 Dec 2005 12:12PM Kmd+1111 w/ConflResolv1201

Comparative Results

Overall Layer 1 Effectiveness
• Benchmark attacks against unrandomized baseline: avg. penetration rate = 100%
• With initial randomization, avg. penetration rate fell to 2.4%
• After further engineering effort, avg. penetration rate fell to 0.56%
• Test results show the DAWSON randomization implementation is growing increasingly effective
• Further to go to approach theoretical limits

Breakdown of Individual Randomization Effectiveness
• On an unprotected system, all baseline attacks succeed
• Different randomization techniques are effective against some attack classes and not others
• Most effective when all techniques are used in combination

Minor Performance Impacts
• Heap transformations cause 5% overhead for apps that are heap-allocation intensive
• Other transformations add no recurring cost
  • One-time overhead for relocation adds modestly to load time
• Absolute address randomization does not change program locality
  • Most relocations occur at page granularity
  • Relative locations of objects are unchanged within a page

Performance Impact
* Data collected on a Pentium 4 1.2GHz CPU with 768MB RAM

Improving the Test Suite
• Further work
  • Add new exploits focusing on payload execution
  • Test payload execution protection
• Offer to security community
  • A package to test memory defense technologies
  • Open source vulnerable service with advanced memory errors and exploits (packaged as Metasploit modules)

DAWSON Red Team Exercise
• Layer 1 blocked 15 of 16 attacks (many reps)
• Red Team identified a new “double vulnerability”
  • This unintended combination of a stack-buffer overflow and a format-string vulnerability made the Red Team exercise a lot more interesting and useful!
• Layer 2 blocked the 16th attack

Attack Outline
• Vulnerable code (simplified):

    void vulnerable(char *attack) {
        char buf1[512], buf2[512];
        strcpy(buf1, attack);
        sprintf(buf2, buf1);
    }

• Attack
  • Guess a writable memory location X
  • Use format-string attack to inject code at X
  • Overflow buf2 to overwrite return address
    • Note: the attack would be impossible if the order of declaration of buf1 and buf2 were interchanged!
  • Use brute force to guess X
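For contrast with the simplified vulnerable function, here is a sketch of a hardened counterpart. It is not part of DAWSON, which defends unmodified binaries. It only illustrates which two properties the attack depends on: an unbounded copy and attacker-controlled data being used as a format string. The name hardened_copy is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hardened counterpart of vulnerable(): the input is always
 * treated as data ("%s"), never as a format string, and the write is bounded
 * by the destination size. Returns 0 on success, -1 if the input had to be
 * truncated or formatting failed. */
int hardened_copy(char *dst, size_t dst_size, const char *attack)
{
    int n;

    assert(dst != NULL && attack != NULL && dst_size > 0);

    n = snprintf(dst, dst_size, "%s", attack);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;   /* output was cut short; dst is still NUL-terminated */
    return 0;
}
```

With this shape, conversion specifiers in the input are copied literally rather than interpreted, and oversized input is truncated instead of overwriting the return address.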

Attack Details: Layer I
[Figure: stack layout from high to low addresses. The formatStrThread frame holds char[4096] safeBuf (the attack string); the formatStrAttack frame holds raddr, char[512] bufmain, and char[512] buf; the sprintf frame holds Arg1/dest, Arg2/fmt, and Arg3 to Arg5. waddr is some page in memory (0x7ffdxxxx) that receives the JMP ESP.]

Vulnerable service code (MDVULN.dll):

    DWORD WINAPI FormatStrThread(LPVOID lparameter)
    {
        char safeBuf[4096];
        nRet = recv(peersock, safeBuf, sizeof(safeBuf), 0);
        formatStrAttack(safeBuf, nRet);
    }

    void formatStrAttack(char *sBuffer, int nSize)
    {
        char buf[512];
        char bufmain[512];
        sprintf(buf, "String : %s", sBuffer);  // (1)
        sprintf(bufmain, buf);                 // (2)
    }                                          // (4)

• sprintf(void *dest, char *fmt, …) interprets many % conversion specifiers, which lets it access the stack in flexible ways
• Addresses embedded at the start of the attack string get interpreted as arguments to sprintf(*dest, *fmt, arg1, arg2, arg3, …)
• The first sprintf copies the attack string from safeBuf into buf
• The second sprintf interprets “496c” in the format string, overflowing waddr into the return address location
• “%229c%hn%229c%hn” manipulates the number of characters written in order to write a JMP ESP instruction into 2 bytes at waddr
• The return from formatStrAttack branches to waddr and executes the JMP ESP instruction; at this time ESP points into the expanded format string near bufmain
• ESP is manipulated to point to where the shell code slid to inside the formatStrThread stack frame
• The normal return now branches to waddr, where it executes the JMP ESP
• The ESP location contains shellcode on the stack that gains control and bootstraps a DLL injection attack

Attack Details: Layer II
• Metasploit shell code for DLL injection:
  • Uses the PEB to look up GetProcAddress and LoadLibrary
  • Loads w32.dll and opens a socket connection to call home
  • Loads the injected DLL payload (hackmark.dll) into memory and tricks Windows into treating it as an ordinary DLL that was linked and loaded
  • Transfers execution to the init entry point in the DLL
• DAWSON Layer 2 catches and stops the PEB access since it is made from code executing on the stack
[Figure labels: Shell code; char[4096] safeBuf; exploit parms]
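The Layer 2 decision described here (deny PEB access when the caller is executing from the stack) reduces to a range check on the faulting instruction's address. The sketch below assumes the handler has a list of legitimately loaded code regions. The struct code_region type and the function caller_is_trusted are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one legitimately loaded code region. */
struct code_region {
    uintptr_t start;   /* first address of the region */
    uintptr_t end;     /* one past the last address */
};

/* Returns 1 if `caller` lies inside any known code region, 0 otherwise.
 * A Layer 2 style handler would deny the PEB access when this returns 0,
 * for example when the accessing instruction sits on the stack or heap. */
int caller_is_trusted(uintptr_t caller,
                      const struct code_region *regions, size_t n)
{
    assert(regions != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (caller >= regions[i].start && caller < regions[i].end)
            return 1;
    }
    return 0;
}
```

The addresses used to exercise such a check would come from the exception context in a real handler; here they are just integers.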

Estimating Number of Attempts Needed
• Attacker needs to guess a writable memory location X
• Probability of correctly guessing X = fraction of writable memory in address space = 10MB/2GB ≈ 0.005, for an app using 10MB of data
• The vulnerable server uses 0.5MB, so the probability of success should be about 1/4000
• But the Red Team succeeded in 128 attempts! Why?
  • The Red Team was varying only the leading 8 bits of the address
  • The PEB was not relocated, and happened to be located at an address that matched the lower 24 bits used by the Red Team
  • The Red Team was informed by the Blue Team of this vulnerability
    • And of the possibility of injecting code into the PEB
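The arithmetic on this slide can be reproduced with a short sketch. It assumes binary units (1MB = 2^20 bytes, 1GB = 2^30 bytes), which the slide does not state, and the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Probability that one uniformly random guess lands in writable memory. */
double guess_success_probability(uint64_t writable_bytes,
                                 uint64_t address_space_bytes)
{
    assert(address_space_bytes > 0 && writable_bytes <= address_space_bytes);
    return (double)writable_bytes / (double)address_space_bytes;
}

/* Number of distinct addresses an attacker must consider when only the
 * top `bits_varied` bits of the address are unknown. */
uint64_t candidate_count(unsigned bits_varied)
{
    assert(bits_varied < 64);
    return (uint64_t)1 << bits_varied;
}
```

Under these assumptions, 10MB out of 2GB gives about 0.0049 and 0.5MB out of 2GB gives exactly 1/4096, in line with the slide's 0.005 and "about 1/4000". Varying only 8 bits leaves at most 256 candidates, so success within 128 attempts is unsurprising once the lower 24 bits are known to match.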

Red Team Attack: Conclusions
• DAWSON is robust against attacks that exploit any single vulnerability in the vulnerable server
• Randomization is vulnerable to rare combinations of vulnerabilities
• To be effective, all memory regions should be randomized
  • Non-randomization of the PEB was the reason the Red Team succeeded in about 100 attempts as opposed to about 4000
  • Ongoing work with the kernel driver will relocate PEB/SEH, thus addressing this weakness
• A multi-layered approach is important
  • Layer 2 was able to defeat the attack even though the attack got through Layer 1

Tech Transition
• Looking at Service IA entrance points, e.g.,
  • CECOM/CERDEC (S&TCD)
  • Navy (NMCI)
  • Air Force
• Initial ideas for commercial sales & support
  • Commercial partner
  • Spin-off
  • Other (GOTS)?

Further Development
• Issues
  • Fixed PEB/TEB base location
  • Exception handler location
  • Initial process heap/CRT heap base randomization
  • Some things not exhaustively covered
    • Process/thread creation, memory allocation
    • Undocumented Native API
  • Occasional communication error with Win32 subsystem
  • Inadequate monitoring and control
• Solutions
  • Kernel-mode driver
  • Expanded vulnerabilities and attacks for Layer 2 testing
  • Control and alerting interfaces
• Enterprise capabilities and productization needed

Thank You Questions?

Backup Slides

Analytic Update

Address Space Randomization (ASR)
• Absolute address randomization
  • Randomize the absolute address of an object
  • Distances between objects may not be randomized
• Relative address randomization
  • Randomize distances between objects, even those within the same segment
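The difference between the two kinds of randomization can be made concrete with a toy model in which objects are just addresses in an array, sorted in ascending order. The function names and the caller-supplied random values are assumptions for illustration, not DAWSON's mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Absolute address randomization: every object moves by the same random
 * shift, so the distance between any two objects is unchanged. */
void randomize_absolute(uintptr_t *addrs, size_t n, uintptr_t base_shift)
{
    assert(addrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        addrs[i] += base_shift;
}

/* Relative address randomization: gaps[i] is random padding inserted in
 * front of object i. Padding accumulates, so distances between objects
 * change as well. Assumes addrs is sorted in ascending order. */
void randomize_relative(uintptr_t *addrs, size_t n, const uintptr_t *gaps)
{
    uintptr_t shift = 0;

    assert((addrs != NULL && gaps != NULL) || n == 0);
    for (size_t i = 0; i < n; i++) {
        shift += gaps[i];
        addrs[i] += shift;
    }
}
```

An attack that relies on the offset between two objects, such as overflowing from one buffer into a neighbour, is unaffected by the first function and perturbed by the second.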

Attacks on DAWSON
• Exploit phase
  • Defeating randomization
• Payload execution phase
  • Difficulty of successfully executing the system functions needed to carry out the attack
  • Comes into play if and when DAWSON exploit protection is defeated

Probability of Successful Attacks
Pr(A) = Pr(V) / [EE(A) * PEE(A)]
• Pr(A): success probability of attack A exploiting vulnerability V
• EE: “exploit effort”
  • Given by the range of randomization of the addresses involved in A
• PEE: “payload execution effort”
  • Attempts needed to successfully execute the “attack payload”
• Multiplicative effect
  • Requires rerandomization after every failed attack
  • Does not apply if the attack defeats the same randomization in both layers
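As a worked instance of the formula, the sketch below evaluates Pr(A) for given inputs. It only covers the multiplicative case the slide states. The slide gives no formula for the case where the multiplicative effect does not apply, so none is attempted here. The function name and the sample values in the note are hypothetical.

```c
#include <assert.h>

/* Pr(A) = Pr(V) / (EE(A) * PEE(A)), valid when the two efforts multiply,
 * i.e. with rerandomization after every failed attack. */
double attack_success_probability(double pr_v, double ee, double pee)
{
    assert(pr_v >= 0.0 && pr_v <= 1.0);
    assert(ee >= 1.0 && pee >= 1.0);
    return pr_v / (ee * pee);
}
```

For example, with Pr(V) = 1, an exploit effort of 16384 and a payload execution effort of 4 (illustrative numbers only), the per-attempt success probability would be 1/65536.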

Layer 2 Threat Model
• Injected code has begun execution
• Attack needs to invoke system APIs to deliver its payload
• No direct invocation of system calls
  • Supposed to be protected by Layer 3 (not implemented for DAWSON)
• Existing code attacks
  • Still require breaking the Layer 1 defense to get to exploitable code within the application
• We estimate PEE(V) for other types of attacks

Data Attacks

PE(V) for Conventional Attacks
• Stack-smashing
  • Modify the return address to point to injected code on the stack
  • Range of possible code addresses is 1GB
  • Can improve success using NOP padding
  • With 1KB padding, PE(V) = 10^-6
• Heap overflow
  • Relies on knowing absolute addresses
  • If the target pointer is in the static data area, PE(V) = 64KB/1GB ≈ 1/15K (about 15K attempts)
  • This estimate applies to many other attack types: return-to-libc, format-string, …
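The two estimates on this slide follow from dividing the randomization range by the size of the window an attacker can hit. The sketch below assumes binary units (1KB = 2^10, 1GB = 2^30 bytes), and its function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Number of equally likely positions for a target window inside a range. */
uint64_t work_factor(uint64_t range_bytes, uint64_t window_bytes)
{
    assert(window_bytes > 0 && window_bytes <= range_bytes);
    return range_bytes / window_bytes;
}

/* Per-attempt probability of hitting the window with one random guess. */
double hit_probability(uint64_t range_bytes, uint64_t window_bytes)
{
    assert(range_bytes > 0 && window_bytes <= range_bytes);
    return (double)window_bytes / (double)range_bytes;
}
```

With binary units, a 1KB NOP pad in a 1GB range gives 2^20 positions (probability about 10^-6), and a 64KB window in 1GB gives 16384 positions, which the slide rounds to 15K.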

Attacks on DAWSON Randomization
• Exploit weaknesses in randomization
  • Attacks that can extract the “randomization key”
    • “Information leakage attacks”
  • Partial overflow attacks
    • Overflow only the least significant byte of an address
  • Double-pointer attacks
    • Rely only on finding a writable address in memory
  • All require a combination of vulnerabilities
    • Low likelihood of finding them
• Derandomization (brute-force) attacks
  • Work factor analyzed in the next slides
  • The [Liang et al ’05] approach promises to block these …
    • Automatically learns signatures of memory error exploits and discards subsequent instances of them
    • Shown to be very effective on recent attacks on Linux

Exception Handler Protection: The Numbers
• Program address space is ~2GB
• Assume a program size of ~200MB
• Dummy padding with alert functions and fail-crash code is ~1.8GB
• Attacker has a 1 in 500 million chance of getting the right DLL address and a 90% chance of tripping an alarm per try
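The 90% figure is the padding's share of the address space. The slide does not say how the 1 in 500 million figure was derived. One reading, offered here only as an assumption, is that candidate addresses are 4-byte aligned within a 2GB space. The sketch below computes both quantities under that assumption, with illustrative function names.

```c
#include <assert.h>
#include <stdint.h>

/* Fraction of the address space covered by alarm-raising padding. */
double alarm_probability(double padding_bytes, double space_bytes)
{
    assert(space_bytes > 0.0 && padding_bytes >= 0.0 &&
           padding_bytes <= space_bytes);
    return padding_bytes / space_bytes;
}

/* Number of candidate addresses, ASSUMING guesses are aligned to
 * `alignment` bytes (an assumption, not stated on the slide). */
uint64_t aligned_slots(uint64_t space_bytes, uint64_t alignment)
{
    assert(alignment > 0);
    return space_bytes / alignment;
}
```

With 1.8GB of padding in a 2GB space the alarm fraction is 0.9, and 2GB of 4-byte-aligned addresses gives 536,870,912 candidates, roughly 500 million.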

Attack Descriptions

Composite Results
