Enhancing Security of Real-World Systems with a Better Understanding of Threats

Shuo Chen
Ph.D. Candidate in Computer Science
Center for Reliable and High Performance Computing
University of Illinois at Urbana-Champaign

My Dissertation
• Security Threat Analysis and Mitigations in Real-World Systems
  • How do errors in hardware and software pose security threats to real-world systems? (common characteristics?)
  • How effective are current defense techniques? (substantial deficiencies?)
  • How to build better defenses?
• Analysis-centric research approach
  • Study the impact of hardware memory errors on system security
  • Software vulnerabilities reported in the Bugtraq and CERT databases, source code of vulnerable applications
  • Current attack methods and defense techniques
  • Analysis results motivate the development of new defense techniques.
• Many areas related to my dissertation

I as a System Hacker/Builder
• Summer ’01, Avaya Labs, Basking Ridge, NJ
  • Port Libsafe to Windows NT/2000.
• Summer ’02, Bell Labs, Holmdel, NJ
  • Detection of network denial of service attacks
  • Hack FreeBSD TCP/IP, network card drivers
• Summer ’03, Microsoft Research, Redmond, WA
  • Audit-enhanced authentication in Kerberos
  • NTOS security subsystem, Kerberos, LSA, NTDLL
• Summer ’04, Microsoft Research, Redmond, WA
  • A tracing technique to identify the dependencies of Windows applications on Administrator privileges
  • NTOS security subsystem, access/privilege checking, application interactions with NTOS

Outline
• Analyzing and Identifying Security Threats on Real-World Systems (analyses)
  • Security compromises due to HW/SW memory corruptions
  • A type of memory corruption attack currently believed to be rare is a realistic threat.
  • Deficiencies of current defense techniques
• New Defense Techniques Towards a Better Security Protection (solutions)
  • A common characteristic of memory corruption attacks: pointer taintedness
  • A theorem-proving-based program analysis
  • A runtime detection technique

Analyzing and Identifying Security Threats on Real-World Systems

Threat of Hardware Memory Errors
• Due to hardware memory errors, users can log in with arbitrary passwords. [Figure: Attacker -> Network server (FTP and SSH)]
• Due to hardware memory errors, packets can penetrate firewalls. [Figure: Attacker -> Firewall (IPChains and Netfilter) -> Target host]
• Emulate random hardware memory errors
• A stochastic model to estimate such threats in real environments
• Motivate other researchers to conduct physical fault injections
  • Java type system subverted due to random hardware memory errors.

Threat of Software Vulnerabilities
• CERT Advisories: 66% of vulnerabilities are low-level memory errors in software.
• Widely exploited by attackers, worms and viruses.

State Machine Model: WU-FTP Server Attack
• Attack stages: embed malicious contents in input -> overwrite a return address -> execute malicious code
• Server states (figure): repeat FTP_service(): get an FTP command; Authentication, x = user ID; seteuid(x); SITE_EXEC(fn); printf(fn,…)
• Attack outcome (figure): seteuid(0); exec("/bin/sh")

State Machine Model: NULL-HTTP Server Attack
• Attack stages: corrupt heap structure -> overwrite function pointer foo -> execute malicious code
• Server states (figure): repeat HTTP_service(): process HTTP header; HTTP_POST(); p=malloc(…); recv(p,…); free(p); *foo()
• Attack outcome (figure): seteuid(0); exec("/bin/sh")

Control Data Attack: Well-Known, Dominant
• Control data:
  • data used as targets of call, return and jump.
  • widely understood as security-critical elements
• Control data attack: the most dominant form of memory corruption attack [CERT and Microsoft Security Bulletin]
• Many current defense techniques enforce program control flow integrity to provide security.

Non-control-data attacks
• Currently very rare in reality.
• One instance suggested by Young and McHugh in 1987.
• How applicable are such attacks against real-world software?
  • Not studied yet, but important.

An Important Question
• Are attackers in general incapable of mounting non-control-data attacks against many real systems?
• PROBABLY NOT!
  • Random hardware memory errors can subvert the security of real-world systems with a non-negligible probability.
  • Software vulnerabilities are more deterministic and more amenable to attacks.
  • Each attack exploiting software vulnerabilities is composed of multiple primitive components. This allows potentially polymorphic attacks. Dangerous.

Our Claim: General Applicability of Non-control-data Attacks
• We claim:
  • Many real-world software applications are susceptible to non-control-data attacks.
  • The severity of the attack consequences is equivalent to that due to control data attacks.
• Validate the claim by constructing non-control-data attacks to get the root privilege on major network servers
  • FTP, HTTP, SSH and Telnet servers
  • Over 1/3 of vulnerabilities in CERT advisories
• Non-control-data attacks are realistic threats.

Non-control-data attack against WU-FTP Server (via a format string bug)
    int x;
    FTP_service(...) {
        authenticate();
        x = user ID of the authenticated user;
        seteuid(x);
        while (1) {
            get_FTP_command(...);   // vulnerable
            if (a data command?)
                getdatasock(...);
        }
    }
    getdatasock( ... ) {
        seteuid(0);
        setsockopt( ... );
        seteuid(x);
    }
Program state as the attack proceeds:
• On entry to FTP_service(): x uninitialized, run as EUID 0
• After authentication: x=109, run as EUID 0
• After seteuid(x): x=109, run as EUID 109. Lose the root privilege!
• Get a special SITE EXEC command. Exploit a format string vulnerability: x=0, still run as EUID 109.
• Get a data command (e.g., PUT). In getdatasock(), after seteuid(0): x=0, run as EUID 0. After seteuid(x): x=0, run as EUID 0.
• When return to service loop, still runs as EUID 0 (root). Allow me to upload /etc/passwd. I can grant myself the root privilege!
• Only corrupt an integer, not a control data attack.

Non-control-hijacking attack against NULL-HTTP Server (via a heap overflow bug)
• Attack the configuration string of the CGI-BIN path.
• Mechanism of CGI
  • Suppose server name = www.foo.com, CGI-BIN = /usr/local/httpd/exe
  • Requested URL = http://www.foo.com/cgi-bin/bar
  • The server executes /usr/local/httpd/exe/bar
• Our attack
  • Exploit the vulnerability to overwrite CGI-BIN to /bin
  • Request URL http://www.foo.com/cgi-bin/sh
  • The server executes /bin/sh
• The server gives me a root shell!
• Only overwrite four characters in the CGI-BIN string. Not a control data attack.

Non-control-data attack against SSH Communications SSH Server (via an integer overflow bug)
    void do_authentication(char *user, ...) {
        int auth = 0;
        ...
        while (!auth) {
            /* Get a packet from the client */
            type = packet_read();
            switch (type) {
                ...
                case SSH_CMSG_AUTH_PASSWORD:
                    if (auth_password(user, password))
                        auth = 1;
                case ...
            }
            if (auth) break;
        }
        /* Perform session preparation. */
        do_authenticated(…);
    }
Value of auth as the attack proceeds:
• Normally auth = 0 until a correct password is supplied.
• The attack overwrites auth: password incorrect, but auth = 1.
• Logged in without correct password.

More non-control-hijacking attacks
• Against NetKit Telnet server (default Telnet server of Red Hat Linux)
  • Exploit a heap overflow bug
  • Overwrite two strings:
      /bin/login -h foo.com -p   (normal scenario)
      /bin/sh -h -p -p           (attack scenario)
  • The server runs /bin/sh when it tries to authenticate the user.
• Against GazTek HTTP server
  • Exploit a stack buffer overflow bug
  • Send a legitimate URL http://www.foo.com/cgi-bin/bar
  • The server checks that "/.." is not embedded in the URL
  • Exploit the bug to change the URL to http://www.foo.com/cgi-bin/../../../../bin/sh
  • The server executes /bin/sh

Implications of Non-Control-Data Attacks
• Control flow integrity is not a sufficiently accurate approximation of software security.
• Many types of non-control data are critical to security.
• Once attackers have the incentive, they are likely to succeed in non-control-data attacks.

Re-Examining Current Defense Techniques
• Many of them are based on control flow integrity
  • Monitor system call sequences
  • Protect control data
  • Non-executable stack and heap
• Pointer encryption (PointGuard)
• Address space randomization
• StackGuard, Libsafe and FormatGuard
• Building a generic and secure defense technique: still an open problem.

Pointer Taintedness Detection: Towards a Better Security Protection for Real-World Systems

Pointer Taintedness
• Pointer taintedness: a pointer value, including a return address, is derived from user input.
• Most memory corruption attacks are due to pointer taintedness.
• Pointer taintedness: a unifying perspective for reasoning about security vulnerabilities.

Most Memory Corruption Attacks are Due to Pointer Taintedness
• Format string attack
  • Taint an argument pointer of functions such as printf, sprintf and syslog.
• Stack buffer overflow (stack smashing)
  • Taint a frame pointer or a return address.
• Heap corruption
  • Taint the free-chunk doubly-linked list maintaining the heap structure.
• Globbing attack
  • User input resides in a location that is used as a pointer by the parent function of glob().

Internals of Stack Buffer Overflow Attacks
Vulnerable code:
    char buf[100];
    strcpy(buf,user_input);
Stack layout (figure; the stack grows from high toward low addresses):
    High   Return addr
           Frame pointer
           buf[99]
           …
           buf[1]
    Low    buf[0]
• user_input is copied into buf starting at buf[0].
• Frame pointer or return address can be tainted.

Internals of Format String Attacks
Vulnerable code:
    recv(buf);
    printf(buf); /* should be printf("%s",buf) */
Input placed in buf: \xdd \xcc \xbb \xaa %d %d %d %n
Stack layout (figure): from high to low address the buffer holds … %n %d %d %d 0xaabbccdd; the stack grows toward low addresses. fmt is the format string pointer and ap is the argument pointer.
In vfprintf(), if (fmt points to "%n") then **ap = (character count)
*ap is a tainted value.

Internals of Heap Corruption Attacks
Vulnerable code:
    buf = malloc(1000);
    recv(sock,buf,1024);
    free(buf);
Heap layout (figure): free chunk A, the allocated buffer buf (filled with user input), free chunk B (fd=A, bk=C), free chunk C.
In free():
    B->fd->bk=B->bk;
    B->bk->fd=B->fd;
When B->fd and B->bk are tainted, the effect of free() is to write a user-specified value to a user-specified address.

Building Defense Techniques based on Pointer Taintedness
• Static code analysis: analyze the source code to extract the conditions under which the possibility of pointer taintedness exists.
  • To uncover potential vulnerabilities
• Runtime detection: monitor at runtime whether a tainted value is dereferenced as a pointer.
  • To defeat memory corruption attacks

Static Analysis about Pointer Taintedness: To Extract Security Specifications of Library Functions
IFIP International Information Security Conference 2004

Library function specifications are crucial to secure programming
• Library function specifications are specified empirically
  • printf(fmt,…), strcpy(d,s), free(p), glob(p), strtok(s,del), savestr(p), …
• Formal and complete specifications are required by compiler techniques to check application source code for security.
• A unified reason why these specifications are required
  • Required to eliminate pointer taintedness.
• Extraction of security specifications of a function is reduced to a theorem proving task

Semantics of Pointer Taintedness
• Formal definition of program semantics is required for theorem proving.
  • Currently defined using an equational logic framework
• Taintedness-aware memory model
  • The logic framework defines operations to fetch the content and test the taintedness (true/false) of each memory location.
• Incorporate pointer taintedness into program semantics
  • Define program semantics at the assembly level to reason about memory layout.
  • Load/Store/ALU instructions: propagate taintedness from source data to destination data.
  • Input functions (scanf, recv and recvfrom)
    • Axiom: The memory locations in the receiving buffer are tainted immediately after these function calls.

Extracting Function Specifications by Theorem Prover
• C source code of a library function is automatically translated to a formal semantic representation.
• Theorem generation: for each pointer dereference in an assignment, generate a theorem stating that the pointer is not tainted.
• Theorem proving: produces a set of sufficient conditions that imply the validity of the theorems. They are the security specifications of the analyzed function.

Example: vfprintf()
    int vfprintf (FILE *s, const char *format, va_list ap) {
      char *p, *q;
      int done, data, n, state;
      char buf[10];
      p = format;
      done = 0;
      if (p == NULL) return 0;
      state = NO_PENDING;
      while (*p != 0) {
        if (state == NO_PENDING) {
          if (*p == '%') state = PENDING;
          else outchar(s, *p);
        } else {
          switch (*p) {
            case '%':
              outchar(s, '%');
              break;
            case 'd':
              data = va_arg(ap, int);
              if (data < 0) { outchar(s, '-'); data = -data; }
              n = 0;
              while (data > 0 && n < 10) {
                buf[n] = data % 10 + '0';   /* Theorem1 */
                data /= 10;
                n++;
              }
              while (n > 0) { n--; outchar(s, buf[n]); }
              break;
            case 's':
              q = va_arg(ap, char *);
              if (q == NULL) break;
              while (*q != 0) { outchar(s, *q); q++; }
              break;
            case 'n':
              q = va_arg(ap, void *);
              *(int *) q = done;            /* Theorem2 */
              break;
            default:
              outchar(s, *p);
          }
          state = NO_PENDING;
        }
        p++;
      }
      return done;
    }
Theorem1: buf+n should not be a tainted value
Theorem2: q should not be a tainted value

Extracting the Specifications of vfprintf()
• Try to prove the two theorems
  • The theorem prover cannot complete the proof initially
  • The theorems are only valid under certain preconditions.
  • Add these preconditions as axioms to the theorem prover.
• Repeat (iterate) until both theorems are proved.
• Four preconditions are added: the specifications of vfprintf (FILE *s, const char *format, va_list ap)
  • ap never points to any location within the current function frame.
  • *ap never points to the location of variable ap, i.e., *ap ≠ &ap
  • Suppose the memory segment that ap sweeps over is called ap_activity_range, then *ap never points to any location within ap_activity_range.
  • No locations within ap_activity_range are tainted before vfprintf() is called.
• The extracted preconditions suggest the scenario of the format string vulnerability.

Other Studied Examples
• Function strcpy()
  • Four security specifications indicating buffer overflow, buffer overlapping and buffer underflow scenarios causing pointer taintedness.
• Function free() of a heap management system
  • Seven security specifications are extracted, including several specifications indicating heap corruption vulnerabilities.
• Socket read functions of Apache HTTP Server and NULL HTTP Server
  • The Apache function is proven to be free of pointer taintedness.
  • Two (known) vulnerabilities are exposed in the theorem proving process of the NULL HTTP Server function.

Runtime Pointer Taintedness Detection: To Defeat Memory Corruption Attacks
To appear in IEEE Conference on Dependable Systems and Networks, 2005.

The Technique
• A processor architectural level mechanism to detect pointer taintedness
  • Implemented on the SimpleScalar simulator
• Architectural implementation of pointer taintedness semantics
• To show the validity of the pointer taintedness concept on whole programs of real applications
  • Network servers
  • SPEC 2000 integer benchmarks

Evaluations on Real-World Software
• Evaluation
  • Effectiveness of detection
  • No false alarm in any application evaluated
  • Transparent to applications
  • A small number of potential attack scenarios remain undetected.
• Pointer taintedness detection can be applied to the whole program of real software
  • Offers a substantial improvement in security protection.

Conclusions

Conclusions
• Many real-world software applications can be compromised by corrupting non-control data.
  • It is insufficient to rely on control flow integrity for software security.
• Pointer taintedness is a unifying perspective to reason about most memory corruption vulnerabilities/attacks.
• Reasoning about pointer taintedness is a promising direction to enhance security on real-world systems
  • A theorem-proving-based code analysis approach
  • A runtime pointer taintedness detection mechanism

Future Directions
• Short term goals
  • Provide a higher degree of automation for the theorem proving technique.
  • Reduce the intrusiveness of the runtime pointer taintedness detection technique
    • Combine with the theorem proving technique. The processor only checks function preconditions.
• Long term goals
  • Extract programming styles susceptible to security attacks. Can compilers detect bad programming styles?
  • Identify a broader range of non-traditional security threats.
  • Study historical data about how security vulnerabilities were discovered, reported and patched.
  • Decompose the behaviors of viruses, worms and rootkits into a number of basic building blocks.

Summary of My Research Methodology
• Analysis-centric approach
  • A significant amount of effort in my dissertation is on analysis.
  • Starting from reality (usually a mess) to define problems!
• I am a data analysis person
  • Excited to analyze real data and incidents
  • Tedious? Sometimes, but it is a step toward a lot of fun.
  • Rewarding? Definitely. Especially important for systems research.
• Goal: strongly motivate research topics that solve real problems.

Backup Slides

Related Work
• Perl security
• Shankar and Wagner (2001)
  • Static analysis to uncover format string vulnerabilities
• Our work: pointer taintedness (Aug. 2004)
  • Reasoning about taintedness using an extended memory model
  • Pointer taintedness as the root cause
• Secure Program Execution (MIT), Minos (UC-Davis) and TaintCheck (CMU) (late 2004 and early 2005)
  • Similar memory model
  • Taintedness of control data
• Taintedness: cause or result of memory corruption?

Static and Dynamic Approaches
• Static approaches (avoid producing memory vulnerabilities in programs)
  • Writing code in a type-safe language
  • Compiler techniques to uncover memory vulnerabilities
  • Compiler instruments source code according to program annotations.
  • Challenges: legacy code and low-level code, compatibility and performance.
  • Fact: Memory vulnerabilities are still constantly discovered and exploited.
• Intrusion detection techniques (defeat attacks, given the existence of vulnerabilities)
  • Specialized techniques
    • Defeat stack buffer overflow and format string attacks.
  • Generic defense techniques
    • Most techniques are designed to defeat control-hijacking attacks: host intrusion detection systems and control flow integrity protection techniques. A very active research area.
    • Others have constraints and difficulties in their deployment (pointer encryption and address randomization).

One-Slide Intro to Equational Logic
• Use term rewriting to establish proofs of theorems.
• Natural number addition expressed in the Maude system.
    0 : Natural .
    s_ : Natural -> Natural .
    _+_ : Natural Natural -> Natural .
    vars N M : Natural
    Axiom: N + 0 = N .
    Axiom: N + s M = s (N + M) .
    (s s s 0) + (s s 0)
      = s ((s s s 0) + (s 0))
      = s (s ((s s s 0) + 0))
      = s (s (s s s 0))
      = s s s s s 0
• Intuitively, this is a proof of "3 + 2 = 5" in natural number algebra.

Taintedness-Aware Memory Model
• A store represents a snapshot of the memory state at a point in the program execution.
• For each memory location, we can evaluate two properties: content and taintedness (true/false).
• Operations on memory locations:
  • The fetch operation Ftch(S,A) gives the content of the memory address A in store S.
  • The location-taintedness operation LocT(S,A) gives the taintedness of the location A in store S.
• Operations on expressions:
  • The evaluation operation Eval(S,E) evaluates expression E in store S.
  • The expression-taintedness operation ExpT(S,E) computes the taintedness of expression E in store S.

Axioms of Eval and ExpT operations
    Eval(S, I) = I    // I is an integer constant
    Eval(S, ^ E1) = Ftch(S, Eval(S,E1))
    Eval(S, E1 + E2) = Eval(S, E1) + Eval(S, E2)
    Eval(S, E1 - E2) = Eval(S, E1) - Eval(S, E2)
    … …
    ExpT(S, I) = false
    ExpT(S, ^ E1) = LocT(S, Eval(S,E1))
    ExpT(S, E1 + E2) = ExpT(S,E1) or ExpT(S,E2)
    ExpT(S, E1 - E2) = ExpT(S,E1) or ExpT(S,E2)
    … …
E.g., is the expression (^100)-2 tainted in store S?
    ExpT(S, (^100)-2) = ExpT(S, (^100)) or ExpT(S, 2)
                      = LocT(S,100) or false
                      = LocT(S,100)
Note: ^ is the dereference operator; ^100 gives the content of location 100.

Semantics of My Assembly Language
• The following instructions are defined:
  • mov [Exp1] <- Exp2
  • branch (Condition) Label
  • call FuncName(Exp1,Exp2,…)
• Axioms defining mov instruction semantics
  • Specify the effects of applying the mov instruction on a store
  • Allow taintedness to propagate from Exp2 to [Exp1].
    Ftch((S ; mov [E1] <- E2),X1) = Eval(S,E2) if (Eval(S,E1) is X1) .
    Ftch((S ; mov [E1] <- E2),X1) = Ftch(S,X1) if not (Eval(S,E1) is X1) .
    LocT((S ; mov [E1] <- E2),X1) = ExpT(S,E2) if (Eval(S,E1) is X1) .
    LocT((S ; mov [E1] <- E2),X1) = LocT(S,X1) if not (Eval(S,E1) is X1) .
• Axioms defining the semantics of recv (similarly scanf and recvfrom: user input functions)
  • Specify the memory locations tainted by the recv call.

Example: strcpy()
C source:
    char * strcpy (char * dst, char * src) {
        char * res;
    0:  res = dst;
        while (*src != 0) {
    1:      *dst = *src;
            dst++;
            src++;
        }
    2:  *dst = 0;
        return res;
    }
Translated to formal semantics:
    0:  mov [res] <- ^ dst
        lbl(#while#6)
        branch (^ ^ src is 0) #ex#while#6
    1:  mov [^ dst] <- ^ ^ src
        mov [dst] <- (^ dst) + 1
        mov [src] <- (^ src) + 1
        branch true #while#6
        lbl(#ex#while#6)
    2:  mov [^ dst] <- 0
        mov [ret] <- ^ res
Theorem generation:
  a) Suppose S1 is the store before Line L1, then LocT(S1,dst) = false
  b) If S0 is the store before Line L0, and S2 is the store after Line L1, then I < Eval(S0, ^dst) or Eval(S0, ^dst+dstsize) ≤ I => LocT(S2,I) = LocT(S0, I)
  c) Suppose S3 is the store before Line L2, then LocT(S3,dst) = false
The generated theorems are then passed to theorem proving.

Specifications Extracted
• Suppose that when function strcpy() is called, the size of the destination buffer (dst) is dstsize and the length of the user input string (src) is srclen.
• Specifications that are extracted by the theorem proving approach
  • srclen <= dstsize (documented in the Linux man page)
  • The buffers src and dst do not overlap in such a way that the buffer dst covers the string terminator of the src string. (documented in the Linux man page)
  • The buffers dst and src do not cover the function frame of strcpy. (not documented)
  • Initially, dst is not tainted. (not documented)
