
Enhancing Security of Real-World Systems with a Better Understanding of the Threats

Shuo Chen, Candidate of Ph.D. in Computer Science, Center for Reliable and High Performance Computing, Coordinated Science Laboratories, University of Illinois at Urbana-Champaign.


Presentation Transcript


  1. Enhancing Security of Real-World Systems with a Better Understanding of the Threats Shuo Chen Candidate of Ph.D. in Computer Science Center for Reliable and High Performance Computing Coordinated Science Laboratories University of Illinois at Urbana-Champaign

  2. My Dissertation • Security Threat Analysis and Mitigations in Real-World Systems • Investigate the impact of hardware memory errors on the security of Internet servers and firewalls. • Simulate random hardware memory errors. • Build a stochastic model to estimate the probability of security violations. • Analyze and model a wide spectrum of software security vulnerabilities reported by CERT and Bugtraq. • Decompose each vulnerability into many primitive operations. • Introduce formalism into the reasoning about and description of real vulnerabilities. • Interesting outcome: discovered a new security bug in an HTTP server, now published in Bugtraq. • Construct non-traditional methods to attack major Internet server programs without being detected by most current defense techniques. This represents a new challenge for defense research. • Develop techniques to provide better security protection for real-world systems • A theorem-proving-based code analysis • A processor-architecture-level runtime defense [Slide callouts: Earlier work; Focus of this talk]

  3. PART I: Analyzing and Identifying Security Threats on Real-World Software

  4. Significance of Memory Vulnerabilities • CERT Advisories: 66% of vulnerabilities are low-level memory errors in software. • Widely exploited by attackers, worms and viruses.

  5. Widely Understood Threats of Memory Corruptions • Once a memory error is found, it is straightforward to take control of the victim system by control-hijacking attacks. • First, overwrite control data, such as return addresses, function pointers, GOT entries or DTOR entries. • Program control is hijacked to execute code with malicious purposes. • The malicious code is able to make system calls with the privilege of the victim process and do real damage to the system.

  6. Current Techniques to Defeat Memory Corruption Attacks • Control hijacking is the most dominant form of memory corruption attacks (CERT and Microsoft Security Bulletin) • Accordingly, many current defense techniques are designed to enforce program control flow integrity in order to provide software security. This research area has been active for many years. • A common justification: attacks not hijacking program control flow (i.e., non-control-hijacking attacks) are rare against real-world software. • Important question: • How confidently can we rely on this justification to build defenses? • Is it possible that people currently underestimate the real threats of memory corruption attacks? • Specifically, does the dominance of control-hijacking attacks stem from attackers’ incapability, or from their lack of incentive, to mount non-control-hijacking attacks?

  7. Our Claim: General Applicability of Non-control-hijacking Attacks • Our previous papers suggest an initial doubt • Even random hardware memory errors can subvert the security of real-world systems with a non-negligible probability. None of the compromises is due to control hijacking. • Software vulnerabilities are more deterministic and more amenable to attacks. Why would attackers be incapable of mounting non-control-hijacking attacks against real-world systems? • We make a hypothetical claim: • Many real-world software applications are susceptible to non-control-hijacking attacks; • The severity of the attack consequences is equivalent to that of control-hijacking attacks. • If the claim is indeed true, it represents a new challenge to defense techniques.

  8. Goal: Empirical Validation of the Claim • Investigate many “representative software applications”. Try to break into them using non-control-hijacking attacks. • Choose representative software applications • We did a quick survey of the most recent four years of CERT advisories. Over 1/3 of the vulnerabilities are in FTP, SSH, Telnet and HTTP servers. • Construct non-control-hijacking attacks to compromise these servers. Each attack results in the root compromise of the victim server.

  9. Non-control-hijacking attack on WU-FTP Server (via a format string bug)

      int x;
      FTP_service(...) {
          authenticate();                          /* x uninitialized, run as EUID 0 */
          x = user ID of the authenticated user;   /* x=109, run as EUID 0 */
          seteuid(x);                              /* x=109, run as EUID 109. Lose the root privilege! */
          while (1) {
              get_FTP_command(...);                /* Get a special SITE EXEC command. Exploit a format string vulnerability. x=0, still run as EUID 109. */
              if (a data command?)                 /* Get a data command (e.g., PUT) */
                  getdatasock(...);
          }
      }
      getdatasock( ... ) {
          seteuid(0);                              /* x=0, run as EUID 0 */
          setsockopt( ... );
          seteuid(x);                              /* x=0, run as EUID 0 */
      }

      When control returns to the service loop, the server still runs as EUID 0 (root). This allows me to upload /etc/passwd, so I can grant myself the root privilege! Only an integer is corrupted; this is not control hijacking.
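
The privilege logic on this slide can be replayed without touching real credentials. The following is a minimal sketch, assuming a simulated effective UID: `sim_euid`, `sim_seteuid`, `sim_getdatasock` and `euid_after_data_command` are hypothetical names introduced only for illustration, and the `corrupt_x` flag stands in for the format string bug overwriting `x`.

```c
#include <assert.h>

/* Simulated process state: no real privileges are touched. */
static int sim_euid = 0;   /* hypothetical stand-in for the effective UID */
static int x;              /* cached user ID, as on the slide */

static void sim_seteuid(int uid) { sim_euid = uid; }

/* Mirrors getdatasock(): raise to root, do the work, drop back to the cached ID. */
static void sim_getdatasock(void) {
    sim_seteuid(0);
    /* setsockopt(...) would run here with root privilege */
    sim_seteuid(x);
}

/* Returns the simulated effective UID after one data command.
   corrupt_x != 0 models the memory bug overwriting x with 0. */
int euid_after_data_command(int authenticated_uid, int corrupt_x) {
    sim_euid = 0;                /* the server starts as root */
    x = authenticated_uid;
    sim_seteuid(x);              /* drop privilege after login */
    if (corrupt_x)
        x = 0;                   /* only an integer changes; no control data */
    sim_getdatasock();
    return sim_euid;
}
```

Calling `euid_after_data_command(109, 0)` models the normal session, and `euid_after_data_command(109, 1)` models the attacked one, where the "drop" inside `sim_getdatasock` restores UID 0 because the cached value itself was corrupted.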

  10. Non-control-hijacking attack on NULL-HTTP Server (via a heap overflow bug) • Attack the configuration string of the CGI-BIN path. • Mechanism of CGI • Suppose server name = www.foo.com and CGI-BIN = /usr/local/httpd/exe • Requested URL = http://www.foo.com/cgi-bin/bar • The server executes /usr/local/httpd/exe/bar • Our attack • Exploit a heap overflow vulnerability to overwrite CGI-BIN to /bin • Request URL http://www.foo.com/cgi-bin/sh • The server executes /bin/sh • The server gives me a root shell! Only four characters in the CGI-BIN string are overwritten. Not control hijacking.
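
The CGI mechanism above amounts to joining a configuration string with the requested program name. A minimal sketch of that join follows; `resolve_cgi` is a hypothetical helper, not the NULL-HTTP server's actual code, and nothing is executed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds "<cgi_bin>/<program>" into out.
   Returns 0 on success, -1 if the result would not fit. */
int resolve_cgi(const char *cgi_bin, const char *program,
                char *out, size_t outsize) {
    int n = snprintf(out, outsize, "%s/%s", cgi_bin, program);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

With the configured value `/usr/local/httpd/exe` and program `bar`, the result is `/usr/local/httpd/exe/bar`. If the configuration string has been overwritten with `/bin`, the same code path yields `/bin/sh` for program `sh`, which is why corrupting a single data string suffices.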

  11. Non-control-hijacking attack on SSH Communications SSH Server (via an integer overflow bug)

      void do_authentication(char *user, ...) {
          int auth = 0;
          ...
          while (!auth) {
              /* Get a packet from the client */
              type = packet_read();
              switch (type) {
              ...
              case SSH_CMSG_AUTH_PASSWORD:
                  if (auth_password(user, password))
                      auth = 1;
              case ...
              }
              if (auth) break;
          }
          /* Perform session preparation. */
          do_authenticated(…);
      }

      [Slide callouts, in order: auth = 0; auth = 0; auth = 1; auth = 1; Password incorrect, but auth = 1; Logged in without correct password]

  12. More non-control-hijacking attacks • Against NetKit Telnet server (default Telnet server of Redhat Linux) • Exploit a heap overflow bug • Overwrite two strings: /bin/login -h foo.com -p (normal scenario) becomes /bin/sh -h -p -p (attack scenario) • The server runs /bin/sh when it tries to authenticate the user. • Against GazTek HTTP server • Exploit a stack buffer overflow bug • Send a legitimate URL http://www.foo.com/cgi-bin/bar • The server checks that “/..” is not embedded in the URL • Exploit the bug to change the URL to http://www.foo.com/cgi-bin/../../../../bin/sh • The server executes /bin/sh
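
The GazTek case is a check-then-use gap: the URL is validated, and the validated buffer is changed before it is used. A minimal sketch, assuming hypothetical helpers `url_passes_check` and `check_bypassed`, where a plain `strcpy` stands in for the stack overflow that rewrites the buffer:

```c
#include <assert.h>
#include <string.h>

/* The server-side check described on the slide: reject URLs containing "/..". */
int url_passes_check(const char *url) {
    return strstr(url, "/..") == NULL;
}

/* Returns 1 if the check passed but the string finally used contains "/..". */
int check_bypassed(const char *requested, const char *corrupted) {
    char url[128];
    if (strlen(requested) >= sizeof url || strlen(corrupted) >= sizeof url)
        return 0;
    strcpy(url, requested);
    if (!url_passes_check(url))
        return 0;                   /* the server rejects the request */
    strcpy(url, corrupted);         /* models the bug rewriting the validated buffer */
    return !url_passes_check(url);  /* the value actually used is no longer safe */
}
```

The sketch only shows why validating user input data once is not enough when that data can be corrupted afterwards; it does not reproduce the real overflow.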

  13. Implications of Non-Control-Hijacking Attacks • Control flow integrity is not a sufficiently accurate approximation to software security. • Given a memory bug in real software, attacker behavior can be very diverse. • Although non-control-hijacking attacks are specific to application semantics, there are many types of non-control data critical to software security • E.g., user identity data, configuration data, user input data and decision-making Booleans. • Once attackers have the incentive, they are likely to succeed in non-control-hijacking attacks.

  14. Re-Examining Current Defense Techniques • They were mainly tested against control-hijacking attacks, so their effectiveness needs to be re-examined. • Many of them are based on control flow integrity • Monitor system call sequence • Protect control data • Non-executable stack and heap • Pointer encryption (PointGuard) • Needs to encrypt pointers in libraries to be effective (challenging because of insufficient type information, frequent type casting, and performance cost). • Address space randomization • Good idea. In each run of the program, the memory layout is different. • Challenging to deploy on all program segments. • Even if every segment is randomized, a recent paper shows that deployment on a 32-bit address space doesn’t provide enough entropy. • StackGuard, Libsafe and FormatGuard • They are specific to defeating stack smashing attacks and format string attacks. Not generic solutions. • Building a generic and secure defense technique to defeat memory corruption attacks is still an open problem. • Future defense research should consider non-control-hijacking attacks more seriously.

  15. PART II: Pointer Taintedness Detection: Towards a Better Security Protection for Real-World Systems

  16. Pointer Taintedness • Pointer Taintedness: a pointer value, including a return address, is derived from user input. • Most memory corruption attacks are due to pointer taintedness. • It allows attackers to specify the memory locations to read, write or transfer control to. Usually a pathological program behavior. • Pointer taintedness provides a unifying perspective for reasoning about a significant number of security vulnerabilities.

  17. Most Memory Corruption Attacks are Due to Pointer Taintedness • Format string attack • Taint an argument pointer of functions such as printf, fprintf, sprintf and syslog. • Stack buffer overflow (stack smashing) • Taint a function frame pointer or a return address. • Heap corruption • Taint the free-chunk doubly-linked list of the heap. • Glibc globbing attack • User input resides in a location that is used as a pointer by the parent function of glob().

  18. Stack Buffer Overflow • Frame pointer or return address can be tainted. • Vulnerable code: char buf[100]; strcpy(buf,user_input); • [Figure: stack layout from high to low addresses: Return addr, Frame pointer, buf[99], …, buf[1], buf[0]; user_input is copied into buf; an arrow marks the direction of stack growth.]

  19. Format String Attack • Vulnerable code: recv(buf); printf(buf); /* should be printf("%s",buf) */ • User input: \xdd \xcc \xbb \xaa %d %d %d %n • [Figure: stack layout from high to low addresses holding the format string (… %n %d %d %d) and the value 0xaabbccdd; fmt is the format string pointer and ap is the argument pointer; an arrow marks the direction of stack growth.] • In vfprintf(), if (fmt points to "%n") then **ap = (character count) • *ap is a tainted value.
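
The comment in the vulnerable code names the fix: pass user input as an argument to a constant `"%s"` format rather than as the format itself. A minimal sketch of that pattern, where `format_user_text` is a hypothetical wrapper and `snprintf` is used so the result can be inspected:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies user text into out through a constant format string.
   Directives such as %d or %n inside user_text are treated as plain
   characters because user_text is never interpreted as a format.
   Returns the length snprintf reports, or -1 on error or truncation. */
int format_user_text(char *out, size_t outsize, const char *user_text) {
    int n = snprintf(out, outsize, "%s", user_text);
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    return n;
}
```

Because the format string is a literal, the argument pointer never walks over attacker-chosen directives, so the input `"%d %n"` is emitted verbatim.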

  20. Heap Corruption Attack • Vulnerable code: buf = malloc(1000); recv(sock,buf,1024); free(buf); • [Figure: heap layout with free chunk A, the allocated buffer buf receiving user input, free chunk B (fd=A, bk=C), and free chunk C.] • In free(): B->fd->bk=B->bk; B->bk->fd=B->fd; • When B->fd and B->bk are tainted, the effect of free() is to write a user-specified value to a user-specified address.
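
The two unlink assignments can be replayed safely on a simulated memory, where an "address" is an index into an array of words. This is a sketch under stated assumptions: the chunk layout (`FD` at offset 0, `BK` at offset 1), `sim_unlink` and `tainted_unlink_writes` are hypothetical and do not reflect any real allocator's layout.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_WORDS 64
#define FD 0   /* assumed offset of the forward link inside a chunk */
#define BK 1   /* assumed offset of the backward link inside a chunk */

/* Performs B->fd->bk = B->bk; B->bk->fd = B->fd on the simulated memory.
   Returns 0 on success, -1 if any access would leave the array. */
int sim_unlink(size_t mem[MEM_WORDS], size_t b) {
    if (b + BK >= MEM_WORDS)
        return -1;
    size_t fd = mem[b + FD];
    size_t bk = mem[b + BK];
    if (fd >= MEM_WORDS - BK || bk >= MEM_WORDS - FD)
        return -1;
    mem[fd + BK] = bk;
    mem[bk + FD] = fd;
    return 0;
}

/* Returns 1 if tainted links in chunk B make unlink store value at target. */
int tainted_unlink_writes(size_t target, size_t value) {
    size_t mem[MEM_WORDS] = {0};
    size_t b = 20;
    if (target < BK)
        return 0;
    mem[b + FD] = target - BK;   /* attacker-chosen "forward" pointer */
    mem[b + BK] = value;         /* attacker-chosen "backward" pointer */
    if (sim_unlink(mem, b) != 0)
        return 0;
    return mem[target] == value;
}
```

With untainted links the same routine simply splices B out of the list; with tainted links it becomes a write of a user-specified value to a user-specified location, as the slide states.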

  21. Building Defense Techniques based on Pointer Taintedness • Static code analysis: analyze the source code to extract the conditions under which the possibility of pointer taintedness exists. • To uncover potential vulnerabilities • Runtime detection: monitor at runtime whether a tainted value is dereferenced as a pointer. • To defeat memory corruption attacks (both control-hijacking and non-control-hijacking attacks)

  22. Project A: Formal Reasoning about Pointer Taintedness: To Extract Security Specifications of Library Functions

  23. Project Overview • Our analysis of CERT advisories shows • A significant portion of vulnerabilities (33.6%) are due to errors in library functions or incorrect invocations of library functions. • Need more rigorous reasoning about library function specifications. • Library function specifications are currently ad hoc. Many of them are specified after real attacks are discovered. • printf(fmt,…): fmt cannot be a user-specified string • strcpy(d,s): the length of string s should not exceed the size of buffer d, and d and s must not overlap. • d = savestr(s): do not free d if this is not the first invocation of savestr. • free(p): p must be a pointer obtained from a previous malloc; p must not have been freed already. • glob(p): p cannot be a string starting with ‘~’ and ending with ‘{’. • What is a unified reason why these specifications are required? • Answer: they are required to eliminate the possibility of pointer taintedness. • Extraction of security specifications of a function is reduced to a theorem proving problem: under which conditions can a function eliminate the possibility of pointer taintedness? • I develop an equational-logic-based theorem proving approach to extract security specifications.

  24. Extracting Function Specifications by Theorem Prover • C source code of a library function → (automatically translated) → formal semantic representation • Theorem generation: for each pointer dereference in an assignment, generate a theorem stating that the pointer is not tainted. • Theorem proving: yields a set of sufficient conditions that imply the validity of the theorems. They are the security specifications of the analyzed function.

  25. Example: vfprintf()

      int vfprintf (FILE *s, const char *format, va_list ap) {
          char *p, *q;
          int done, data, n, state;
          char buf[10];
          p = format;
          done = 0;
          if (p == NULL) return 0;
          state = NO_PENDING;
          while (*p != 0) {
              if (state == NO_PENDING) {
                  if (*p == '%') state = PENDING;
                  else outchar(s, *p);
              } else {
                  switch (*p) {
                  case '%':
                      outchar(s, '%');
                      break;
                  case 'd':
                      data = va_arg(ap, int);
                      if (data < 0) { outchar(s, '-'); data = -data; }
                      n = 0;
                      while (data > 0 && n < 10) {
                          buf[n] = data % 10 + '0';   /* Theorem 1 */
                          data /= 10;
                          n++;
                      }
                      while (n > 0) { n--; outchar(s, buf[n]); }
                      break;
                  case 's':
                      q = va_arg(ap, char *);
                      if (q == NULL) break;
                      while (*q != 0) { outchar(s, *q); q++; }
                      break;
                  case 'n':
                      q = va_arg(ap, void *);
                      *(int *) q = done;              /* Theorem 2 */
                      break;
                  default:
                      outchar(s, *p);
                  }
                  state = NO_PENDING;
              }
              p++;
          }
          return done;
      }

      Theorem 1: buf+n should not be a tainted value
      Theorem 2: q should not be a tainted value

  26. Extracting the Specifications of vfprintf() • Try to prove the two theorems • Initially, the theorem prover cannot complete the proof, because the theorems are only valid under certain preconditions. • Add these preconditions as axioms to the theorem prover. • Repeat the above step until the theorems are proved. • Finally, the following four preconditions are added, which are the specifications of vfprintf (FILE *s, const char *format, va_list ap) • ap never points to any location within the current function frame. • *ap never points to the location of variable ap, i.e., *ap != &ap • Suppose the memory segment that ap sweeps over is called ap_activity_range, then *ap never points to any location within ap_activity_range. • No locations within ap_activity_range are tainted before vfprintf() is called. [Slide callout: Suggest the scenario of format string vulnerability]

  27. Other Studied Examples • Function strcpy() • Four security specifications indicating buffer overflow, buffer overlapping and buffer underflow scenarios causing pointer taintedness. • Function free() of a heap management system • Seven security specifications are extracted, including several specifications indicating heap corruption vulnerabilities. • Socket read functions of Apache HTTPD and NULL HTTPD • The Apache function is proven to be free of pointer taintedness. • Two (known) vulnerabilities are exposed in the theorem proving process for the NULL HTTPD function.

  28. Project B: Runtime Pointer Taintedness Detection: To Defeat Memory Corruption Attacks

  29. Project Overview • We propose a processor-architecture-level mechanism to detect pointer taintedness • Implemented on the SimpleScalar simulator • An extended memory system with a taintedness bit attached to every byte • Enhanced load, store and ALU instructions to track taintedness bits in memory • Security attacks are detected when tainted data are dereferenced. • Evaluation • It detects both control-hijacking and non-control-hijacking attacks against real-world software. • No known false positives: no alarm during normal executions of network servers and SPEC benchmarks. Fully compatible with existing applications. • Transparent to applications. We can run precompiled binaries on the architecture. • Some potential false negative scenarios. They are rare and not defeated by current generic detection techniques either.
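
The mechanism above (a taint bit per byte, propagation through load, store and ALU operations, and an alarm on dereference) can be illustrated with a toy machine. This is a minimal sketch and not the SimpleScalar implementation: the 256-byte memory, the `Reg` type and every function name below are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 256               /* every uint8_t address is in range */

typedef struct {
    uint8_t value;
    uint8_t tainted;               /* 1 if derived from user input */
} Reg;

static uint8_t mem[MEM_SIZE];
static uint8_t taint[MEM_SIZE];    /* one taintedness bit per memory byte */
static int alarm_raised;

static Reg constant(uint8_t v) { Reg r = { v, 0 }; return r; }

void machine_reset(void) {
    memset(mem, 0, sizeof mem);
    memset(taint, 0, sizeof taint);
    alarm_raised = 0;
}

/* Models user input arriving from outside: the byte is marked tainted. */
void recv_byte(uint8_t addr, uint8_t data) { mem[addr] = data; taint[addr] = 1; }

/* Load: dereferencing a tainted address raises the alarm. */
Reg load(Reg addr) {
    Reg r = { 0, 0 };
    if (addr.tainted) { alarm_raised = 1; return r; }
    r.value = mem[addr.value];
    r.tainted = taint[addr.value];  /* taint travels with the data */
    return r;
}

/* Store: same dereference check; the taint bit is written with the data. */
void store(Reg addr, Reg src) {
    if (addr.tainted) { alarm_raised = 1; return; }
    mem[addr.value] = src.value;
    taint[addr.value] = src.tainted;
}

/* ALU: the result is tainted if either operand is tainted. */
Reg alu_add(Reg a, Reg b) {
    Reg r = { (uint8_t)(a.value + b.value), (uint8_t)(a.tainted | b.tainted) };
    return r;
}

int demo_tainted_dereference(void) {
    machine_reset();
    recv_byte(16, 200);                 /* user input lands in memory */
    Reg p = load(constant(16));         /* tainted data, clean address */
    Reg q = alu_add(p, constant(4));    /* pointer arithmetic keeps the taint */
    store(q, constant(7));              /* tainted value used as an address */
    return alarm_raised;
}

int demo_clean_dereference(void) {
    machine_reset();
    recv_byte(16, 200);
    Reg v = load(constant(16));
    store(constant(32), v);             /* copying tainted data is allowed */
    return alarm_raised;
}
```

Only the use of a tainted value as an address triggers the alarm; moving or computing with tainted data does not. That distinction is consistent with the slide's report of no alarms during normal executions.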

  30. Conclusions

  31. Conclusions • Our analysis shows that real-world software can be compromised by corrupting non-control data. Non-control-hijacking attacks represent a realistic threat. • It is insufficient to rely on control flow integrity for software security. • Pointer taintedness is a common characteristic of most memory corruption attacks, including control hijacking and non-control-hijacking attacks. • A theorem proving based code analysis approach is designed to reason about possibilities of pointer taintedness. • E.g., to formally extract security specifications of library functions. • A runtime pointer taintedness detection mechanism is designed. It can effectively detect most memory corruption attacks.

  32. Summary of My Research Methodology • Analysis-centric approach • Analyzed the impact of hardware faults on security (fault injection + stochastic modeling) • Analyzed Bugtraq and CERT vulnerability databases • Analyzed application source code, attacks and current defense techniques • Analysis results motivate me • To expose new security threats • To propose new defense techniques • I like doing analysis of real data and incidents • Tedious? Sometimes, but it is a crucial step toward a lot of fun. • Rewarding? Definitely. Analysis is especially important for systems research. • Goal: strongly motivate research topics that solve real-world problems.

  33. Backup Slides

  34. Static and Dynamic Approaches • Static approaches (avoid producing memory vulnerabilities in programs) • Writing code in a type-safe language • Compiler techniques to uncover memory vulnerabilities • Compiler instruments source code according to program annotations. • Challenges: legacy code and low-level code, compatibility and performance. • Fact: Memory vulnerabilities are still constantly discovered and exploited. • Intrusion detection techniques (defeat attacks, given the existence of vulnerabilities) • Specialized techniques • Defeat stack buffer overflow and format string attacks. • Generic defense techniques • Most techniques are designed to defeat control-hijacking attacks: host intrusion detection systems and control flow integrity protection techniques. A very active research area. • Others have constraints and difficulties in their deployment (pointer encryption and address randomization).

  35. One-Slide Intro to Equational Logic • Use term rewriting to establish proofs of theorems. • Natural number addition expressed in the Maude system.

      0 : Natural .
      s_ : Natural -> Natural .
      _+_ : Natural Natural -> Natural .
      vars N M : Natural
      Axiom: N + 0 = N .
      Axiom: N + s M = s (N + M) .

      (s s s 0) + (s s 0)
        = s ((s s s 0) + (s 0))
        = s (s ((s s s 0) + 0))
        = s (s (s s s 0))
        = s s s s s 0

      Intuitively, this is a proof of “3 + 2 = 5” in natural number algebra.
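
The two axioms can be read as left-to-right rewrite rules. The sketch below applies them recursively in C; it is an illustration and not Maude. A natural number is represented by its count of successor applications, and `peano_add` and its `steps` counter are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Adds n and m using only the two axioms from the slide.
   If steps is non-NULL, it is incremented once per axiom application. */
unsigned peano_add(unsigned n, unsigned m, unsigned *steps) {
    if (steps != NULL)
        (*steps)++;
    if (m == 0)
        return n;                            /* Axiom: N + 0 = N */
    return 1 + peano_add(n, m - 1, steps);   /* Axiom: N + s M = s (N + M) */
}
```

For 3 + 2 the second axiom fires twice and the first once, matching the three rewriting steps in the derivation; the final line of the derivation only drops parentheses.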

  36. Axioms of Eval and ExpT operations

      Eval(S, I) = I                        // I is an integer constant
      Eval(S, ^ E1) = Ftch(S, Eval(S,E1))
      Eval(S, E1 + E2) = Eval(S, E1) + Eval(S, E2)
      Eval(S, E1 - E2) = Eval(S, E1) - Eval(S, E2)
      … …

      ExpT(S, I) = false
      ExpT(S, ^ E1) = LocT(S,Eval(S,E1))
      ExpT(S, E1 + E2) = ExpT(S,E1) or ExpT(S,E2)
      ExpT(S, E1 - E2) = ExpT(S,E1) or ExpT(S,E2)
      … …

      E.g., is the expression (^100)–2 tainted in store S?

      ExpT(S, (^100)–2)
        = ExpT(S, (^100)) or ExpT(S, 2)
        = LocT(S,100) or false
        = LocT(S,100)

      Note: ^ is the dereference operator; ^100 gives the content of location 100.

  37. Taintedness-Aware Memory Model • A store represents a snapshot of the memory state at a point in the program execution. • For each memory location, we can evaluate two properties: content and taintedness (true/false). • Operations on memory locations: • The fetch operation Ftch(S,A) gives the content of the memory address A in store S • The location-taintedness operation LocT(S,A) gives the taintedness of the location A in store S • Operations on expressions: • The evaluation operation Eval(S,E) evaluates expression E in store S • The expression-taintedness operation ExpT(S,E) computes the taintedness of expression E in store S.
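
The four operations of the memory model, together with the Eval and ExpT axioms, translate directly into a recursive evaluator. The following is a sketch under stated assumptions: the `Store` and `Expr` types, the 256-location store size and the `E_*` constructors are hypothetical; only the operation names `Ftch`, `LocT`, `Eval` and `ExpT` come from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STORE_SIZE 256

/* A store: content and taintedness for every location (assumed layout). */
typedef struct {
    long content[STORE_SIZE];
    bool tainted[STORE_SIZE];
} Store;

typedef enum { E_CONST, E_DEREF, E_ADD, E_SUB } Kind;

typedef struct Expr {
    Kind kind;
    long value;                    /* used by E_CONST only */
    const struct Expr *lhs, *rhs;  /* E_DEREF uses lhs only */
} Expr;

long Ftch(const Store *s, long a) {
    assert(a >= 0 && a < STORE_SIZE);
    return s->content[a];
}

bool LocT(const Store *s, long a) {
    assert(a >= 0 && a < STORE_SIZE);
    return s->tainted[a];
}

long Eval(const Store *s, const Expr *e) {
    switch (e->kind) {
    case E_CONST: return e->value;
    case E_DEREF: return Ftch(s, Eval(s, e->lhs));
    case E_ADD:   return Eval(s, e->lhs) + Eval(s, e->rhs);
    case E_SUB:   return Eval(s, e->lhs) - Eval(s, e->rhs);
    }
    return 0;
}

bool ExpT(const Store *s, const Expr *e) {
    switch (e->kind) {
    case E_CONST: return false;
    case E_DEREF: return LocT(s, Eval(s, e->lhs));
    case E_ADD:
    case E_SUB:   return ExpT(s, e->lhs) || ExpT(s, e->rhs);
    }
    return false;
}
```

For the slide's example (^100)–2, `ExpT` reduces to `LocT(S,100)`: the expression is tainted exactly when location 100 is.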

  38. Semantics of Language L • The following instructions are defined: • mov [Exp1] <- Exp2 • branch (Condition) Label • call FuncName(Exp1,Exp2,…) • Axioms defining mov instruction semantics • Specify the effects of applying mov instruction on a store • Allow taintedness to propagate from Exp2 to [Exp1]. • Axioms defining the semantics of recv (similarly, scanf, recvfrom: user input functions) • Specify the memory locations tainted by the recv call.

  39. Example: strcpy()

      C source:

      char * strcpy (char * dst, char * src) {
          char * res;
      0:  res = dst;
          while (*src != 0) {
      1:      *dst = *src;
              dst++; src++;
          }
      2:  *dst = 0;
          return res;
      }

      Translated to formal semantics:

      0: mov [res] <- ^ dst
         lbl(#while#6)
         branch (^ ^ src is 0) #ex#while#6
      1: mov [^ dst] <- ^ ^ src
         mov [dst] <- (^ dst) + 1
         mov [src] <- (^ src) + 1
         branch true #while#6
         lbl(#ex#while#6)
      2: mov [^ dst] <- 0
         mov [ret] <- ^ res

      Theorem generation:

      a) Suppose S1 is the store before Line L1, then LocT(S1,dst) = false
      b) If S0 is the store before Line L0, and S2 is the store after Line L1, then
         I < Eval(S0, ^dst) or Eval(S0, ^dst+dstsize) <= I  =>  LocT(S2,I) = LocT(S0, I)
      c) Suppose S3 is the store before Line L2, then LocT(S3,dst) = false

      Theorem proving

  40. Specifications Extracted • Suppose that when function strcpy() is called, the size of the destination buffer (dst) is dstsize and the length of the user input string (src) is srclen • Specifications that are extracted by the theorem proving approach • srclen <= dstsize • The buffers src and dst do not overlap in such a way that the buffer dst covers the string terminator of the src string. • The buffers dst and src do not cover the function frame of strcpy. • Initially, dst is not tainted [Slide callouts: Documented in Linux man page; Not documented]
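
The first two extracted specifications can be enforced by a caller-side wrapper. The sketch below is one possible reading and not the method used in the dissertation: `checked_strcpy` is a hypothetical name, it assumes srclen counts the terminating NUL, and it rejects any overlap between the buffers, which is stricter than the slide's condition. The frame-coverage and taintedness conditions cannot be checked in portable C and are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies src into dst only if the checkable preconditions hold.
   Returns 0 on success, -1 (without writing to dst) otherwise. */
int checked_strcpy(char *dst, size_t dstsize, const char *src) {
    size_t srclen = strlen(src) + 1;   /* assumption: srclen includes the NUL */
    if (srclen > dstsize)
        return -1;                     /* violates srclen <= dstsize */

    /* Compare addresses as integers; reject any overlap of the two ranges. */
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    if (d < s + srclen && s < d + dstsize)
        return -1;

    memcpy(dst, src, srclen);
    return 0;
}
```

A caller would check the return value before using `dst`, for example treating -1 as a rejected request rather than truncating the input.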
