This paper discusses a method for translating C programs to Java with a focus on memory safety and ANSI compliance. Given that C does not guarantee memory safety, vulnerabilities like buffer overflows can lead to significant security risks. By leveraging Java's memory safety guarantees, the translation process can prevent dangerous operations and memory bugs. Key techniques include representing pointers and memory blocks as Java objects, simulating pointer operations with access methods, and using fat pointers to manage memory safely. The approach aims to reduce extensive code rewriting while enabling safe execution of legacy C programs.
The Fail-Safe C to Java translator Yuhki Kamijima (Tohoku Univ.)
Background • The programming language C does not guarantee memory safety • This is the cause of memory attacks, e.g. buffer overflow attacks • Attackers can obtain root privilege and operate freely • We want to execute C programs safely!
Background • How can we achieve memory safety? • Point : Java is memory safe • We use Java to guarantee memory safety
Goal • Source to source, C to Java translation with memory safety and ANSI conformance • Save a lot of work spent on rewriting programs • Java rejects dangerous programs • Prevent memory bugs and attacks
Summary • We propose one way of translating C to Java • Well-defined operations → simulate by using objects • Unsafe (= undefined) operations → raise an exception • Pointer operations • Represent pointers and memory blocks as Java objects • Simulate pointer operations by using these objects • Casts • Use access methods to access memory blocks • Enables access by different types
Outline • C to Java translation by using Java objects • Representation of pointer operations • Examples of translation • Class details • Fat pointer, block • Access methods • Fat integer • Implementation of the translator • Experiments and considerations • Related work • Conclusion and future work
Representation of pointer operations • C : A pointer points to a memory block • Java : The object representing a pointer refers to an object representing a memory block
Java classes • Pointer : FatPointer • Fields : base, offset • 1 word memory block : FatBlock • Field : contents • Method : access methods
Examples of translation : declaration • Regard variables as one element arrays • C : int *p = NULL ; int a[3] ; • Java : FatBlock p = new FatBlock(1) ; FatBlock a = new FatBlock(3) ;
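As a rough illustration of the two classes, the sketch below shows one shape they might take. Only the class names, the base, offset and contents fields, and the readFat / writeFat method names come from the slides (Fat and FatInt are introduced on later slides). The method bodies, the 4-byte word size, the zero-filled initial contents and the ClassSketchDemo class are assumptions made for this example.

```java
import java.util.Arrays;

// Sketch only: names follow the slides, bodies are assumed.
abstract class Fat {
    final FatBlock base;  // referenced block, or null for a plain integer
    final int offset;     // byte offset into base, or the integer value
    Fat(FatBlock base, int offset) { this.base = base; this.offset = offset; }
}

final class FatPointer extends Fat {
    FatPointer(FatBlock base, int offset) { super(base, offset); }
}

final class FatInt extends Fat {
    FatInt(int value) { super(null, value); }
}

final class FatBlock {
    private final Fat[] contents;  // one slot per 4-byte word (assumed word size)

    FatBlock(int words) {
        contents = new Fat[words];
        Arrays.fill(contents, new FatInt(0));  // assumed: blocks start zero-filled
    }

    Fat readFat(int off) { return contents[slot(off)]; }        // 1 word read
    void writeFat(int off, Fat v) { contents[slot(off)] = v; }  // 1 word write

    // Map a virtual byte offset to a slot, rejecting offsets outside the block.
    private int slot(int off) {
        if (off < 0 || off % 4 != 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        return off / 4;
    }
}

final class ClassSketchDemo {
    public static void main(String[] args) {
        FatBlock a = new FatBlock(3);              // three words, virtual offsets 0, 4, 8
        FatPointer q = new FatPointer(a, 8);       // refers to the third word of a
        q.base.writeFat(q.offset, new FatInt(7));  // store 7 through the pointer
        System.out.println(a.readFat(8).offset);   // value read back from offset 8
    }
}
```

The pointer itself never holds a raw address: every access goes through the block's access methods, which is where checks can be placed.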
Address operation • C : p = &a[1] ; • Java : p.writeFat(0*4, new FatPointer(a, 1*4)) ; • Step 1 : new FatPointer(a, 1*4) generates a new pointer which points to offset 4 of a • Step 2 : writeFat, the access method for a 1 word write, writes the pointer at offset 0 of p
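To make the address operation concrete, here is a minimal runnable sketch of the translated statement. The translated line is taken from the slide; the supporting class bodies, the 4-byte word size and the AddressOfDemo wrapper are assumptions for this example.

```java
import java.util.Arrays;

// Supporting classes: names follow the slides, bodies are assumed.
abstract class Fat {
    final FatBlock base;
    final int offset;
    Fat(FatBlock base, int offset) { this.base = base; this.offset = offset; }
}
final class FatPointer extends Fat {
    FatPointer(FatBlock base, int offset) { super(base, offset); }
}
final class FatInt extends Fat {
    FatInt(int value) { super(null, value); }
}
final class FatBlock {
    private final Fat[] contents;
    FatBlock(int words) {
        contents = new Fat[words];
        Arrays.fill(contents, new FatInt(0));
    }
    Fat readFat(int off) { return contents[slot(off)]; }
    void writeFat(int off, Fat v) { contents[slot(off)] = v; }
    private int slot(int off) {
        if (off < 0 || off % 4 != 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        return off / 4;
    }
}

final class AddressOfDemo {
    // Returns { p, a } after executing the translated statement.
    static FatBlock[] run() {
        FatBlock p = new FatBlock(1);                 // int *p = NULL;
        FatBlock a = new FatBlock(3);                 // int a[3];
        p.writeFat(0 * 4, new FatPointer(a, 1 * 4));  // p = &a[1];
        return new FatBlock[] { p, a };
    }

    public static void main(String[] args) {
        FatBlock[] pa = run();
        Fat stored = pa[0].readFat(0);
        System.out.println((stored.base == pa[1]) + " " + stored.offset);
    }
}
```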
Addition of pointer and integer • C : p + 1 ; • Java : i = p.readFat(0*4) ; new FatPointer(i.base, i.offset+1*4) ; • Step 1 : readFat (1 word read) reads the pointer contained at offset 0 of p, which points to offset 4 of a • Step 2 : make a new pointer which points to offset 8 of a
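The sketch below runs the translated addition and then stores through the result, so the effect of the new offset is visible. The two translated lines are from the slide; the class bodies, the assumption that sizeof(int) is 4, the pPlusOne helper and the extra store are illustrative additions.

```java
import java.util.Arrays;

// Supporting classes: names follow the slides, bodies are assumed.
abstract class Fat {
    final FatBlock base;
    final int offset;
    Fat(FatBlock base, int offset) { this.base = base; this.offset = offset; }
}
final class FatPointer extends Fat {
    FatPointer(FatBlock base, int offset) { super(base, offset); }
}
final class FatInt extends Fat {
    FatInt(int value) { super(null, value); }
}
final class FatBlock {
    private final Fat[] contents;
    FatBlock(int words) {
        contents = new Fat[words];
        Arrays.fill(contents, new FatInt(0));
    }
    Fat readFat(int off) { return contents[slot(off)]; }
    void writeFat(int off, Fat v) { contents[slot(off)] = v; }
    private int slot(int off) {
        if (off < 0 || off % 4 != 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        return off / 4;
    }
}

final class PointerAddDemo {
    // Translation of the C expression `p + 1`, where p holds an int pointer.
    static FatPointer pPlusOne(FatBlock p) {
        Fat i = p.readFat(0 * 4);                         // read the pointer stored in p
        return new FatPointer(i.base, i.offset + 1 * 4);  // advance by one int (4 bytes assumed)
    }

    public static void main(String[] args) {
        FatBlock p = new FatBlock(1);
        FatBlock a = new FatBlock(3);
        p.writeFat(0 * 4, new FatPointer(a, 1 * 4));      // p = &a[1];
        FatPointer q = pPlusOne(p);                       // p + 1
        q.base.writeFat(q.offset, new FatInt(42));        // *(p + 1) = 42;
        System.out.println(q.offset + " " + a.readFat(2 * 4).offset);
    }
}
```

The addition only builds a new FatPointer; neither p nor the pointer stored in it is modified.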
Cast • C : *(char *)(&a[1]) ; • Java : i = new FatPointer(a, 1*4) ; i.base.readByte(i.offset) ; • Step 1 : create a new pointer which points to offset 4 of a • Step 2 : readByte reads 1 byte of data from the location i points to (in the figure, the word at offset 4 holds 0x12345678 and the byte read is 0x12)
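A possible readByte for a word-based block is sketched below. The slides do not show how a byte is extracted from a word, so the byte order here (most significant byte first, matching the 0x12 in the figure) and the choice to use a stored value's offset field as its integer value are assumptions, as is the CastDemo class.

```java
import java.util.Arrays;

// Supporting classes: names follow the slides, bodies are assumed.
abstract class Fat {
    final FatBlock base;
    final int offset;
    Fat(FatBlock base, int offset) { this.base = base; this.offset = offset; }
}
final class FatPointer extends Fat {
    FatPointer(FatBlock base, int offset) { super(base, offset); }
}
final class FatInt extends Fat {
    FatInt(int value) { super(null, value); }
}
final class FatBlock {
    private final Fat[] contents;
    FatBlock(int words) {
        contents = new Fat[words];
        Arrays.fill(contents, new FatInt(0));
    }
    void writeFat(int off, Fat v) {
        if (off < 0 || off % 4 != 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        contents[off / 4] = v;
    }
    // 1 byte read: pick one byte out of the word that contains the offset.
    // Assumed byte order: most significant byte at the lowest offset.
    byte readByte(int off) {
        if (off < 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        int word = contents[off / 4].offset;
        int shift = 8 * (3 - off % 4);
        return (byte) (word >>> shift);
    }
}

final class CastDemo {
    static byte run() {
        FatBlock a = new FatBlock(3);
        a.writeFat(1 * 4, new FatInt(0x12345678));  // a[1] = 0x12345678;
        FatPointer i = new FatPointer(a, 1 * 4);    // &a[1]
        return i.base.readByte(i.offset);           // *(char *)(&a[1])
    }

    public static void main(String[] args) {
        System.out.printf("0x%02x%n", run());
    }
}
```

Because the access width is chosen by the method that is called, not by the pointer, the cast itself needs no runtime representation.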
Pointer operation • How to simulate a pointer which points to the middle of a memory block? • References in Java cannot point to the middle of an object • Use fat pointers
Fat Pointer [Austin et al. 94] [Oiwa et al. 01] et al. • Represent a pointer as two words • base : always points to the front of a memory block • offset : contains an integer meaning the distance from base to the address pointed to by the pointer • Example : offset 8 means the location we want to point to is at an 8 byte distance from base
FatPointer class • Simulate fat pointers in Java • base : refers to a Block object • offset : contains an integer
Block abstract class • Simulate memory blocks • contents : contains an array of data objects • access methods : deal with memory accesses • Has concrete subclasses • FatBlock (contents : FatPointer objects) • ByteBlock (contents : Byte objects, 1 byte of data each) • ・・・
Access methods • Memory accesses are implemented using access methods • Block class has several methods for reading and writing • readFat : 1 word read • writeFat : 1 word write • readByte : 1 byte read • writeShort : 2 byte write • … • Enables memory accesses by different types
Fat Integer [Oiwa et al. 01] • Represent integers by two words • Pointers are also integers in C, generally expressed with 1 word • We represent integers by objects • FatInt class • base : always null • offset : contains integer • Example : the integer 5 is a FatInt with base null and offset 5
Fat class • Abstract class with fields base and offset • Common parent class of FatPointer and FatInt • FatBlock contents contains Fat objects (FatPointer or FatInt)
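Since the base always refers to the whole block, an access method can compare the offset against the block size on every access. The sketch below shows that idea under assumptions: the slides say unsafe operations raise an exception but do not name one, so IndexOutOfBoundsException, the class bodies and the deref helper are choices made for this example only.

```java
import java.util.Arrays;

// Supporting classes: names follow the slides, bodies are assumed.
abstract class Fat {
    final FatBlock base;
    final int offset;
    Fat(FatBlock base, int offset) { this.base = base; this.offset = offset; }
}
final class FatPointer extends Fat {
    FatPointer(FatBlock base, int offset) { super(base, offset); }
}
final class FatInt extends Fat {
    FatInt(int value) { super(null, value); }
}
final class FatBlock {
    private final Fat[] contents;
    FatBlock(int words) {
        contents = new Fat[words];
        Arrays.fill(contents, new FatInt(0));
    }
    Fat readFat(int off) { return contents[slot(off)]; }
    void writeFat(int off, Fat v) { contents[slot(off)] = v; }
    private int slot(int off) {
        if (off < 0 || off % 4 != 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        return off / 4;
    }
}

final class BoundsDemo {
    // Dereference a fat pointer as a 1 word read; the block rejects bad offsets.
    static Fat deref(FatPointer ptr) {
        return ptr.base.readFat(ptr.offset);
    }

    public static void main(String[] args) {
        FatBlock a = new FatBlock(3);
        a.writeFat(8, new FatInt(99));
        FatPointer mid = new FatPointer(a, 8);                // points into the middle of a
        System.out.println(deref(mid).offset);
        FatPointer past = new FatPointer(a, mid.offset + 4);  // one past the end: may be formed
        try {
            deref(past);                                      // but not dereferenced
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```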
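One way to organise access methods of different widths is to build the wider ones from 1 byte accesses in the abstract class, as sketched below for a byte-based block. The method names appear on the slides; the signatures, the big-endian byte order, the check helper and the choice of exception are assumptions, and the real Block class may be structured differently.

```java
// Sketch only: method names follow the slides, everything else is assumed.
abstract class Block {
    abstract int size();                       // block size in bytes
    abstract byte readByte(int off);           // 1 byte read
    abstract void writeByte(int off, byte v);  // 1 byte write

    // 2 byte read composed from two 1 byte reads (big-endian order assumed).
    short readShort(int off) {
        check(off, 2);
        return (short) ((readByte(off) & 0xFF) << 8 | (readByte(off + 1) & 0xFF));
    }

    // 2 byte write; the range is checked first so no partial write can happen.
    void writeShort(int off, short v) {
        check(off, 2);
        writeByte(off, (byte) (v >>> 8));
        writeByte(off + 1, (byte) v);
    }

    // Reject any access that is not fully inside the block.
    protected void check(int off, int width) {
        if (off < 0 || off > size() - width) {
            throw new IndexOutOfBoundsException(
                "access of " + width + " byte(s) at offset " + off);
        }
    }
}

final class ByteBlock extends Block {
    private final byte[] contents;
    ByteBlock(int bytes) { contents = new byte[bytes]; }
    int size() { return contents.length; }
    byte readByte(int off) { check(off, 1); return contents[off]; }
    void writeByte(int off, byte v) { check(off, 1); contents[off] = v; }
}

final class AccessMethodDemo {
    public static void main(String[] args) {
        Block b = new ByteBlock(4);
        b.writeShort(2, (short) 0x1234);  // write 2 bytes as a short
        System.out.printf("%02x %02x%n", b.readByte(2), b.readByte(3));  // read them back as bytes
    }
}
```

A FatBlock subclass would implement the same methods over word-sized slots, so translated code can call readByte or readShort without knowing which kind of block it has.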
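Because FatPointer and FatInt share a parent, one block can hold both, and a value that turns out to be an integer can be rejected when it is dereferenced. The sketch below illustrates this; the deref helper, the explicit null check and the use of IllegalStateException are assumptions for the example, not the translator's documented behaviour.

```java
import java.util.Arrays;

// Supporting classes: names follow the slides, bodies are assumed.
abstract class Fat {
    final FatBlock base;  // null for integers
    final int offset;     // integer value, or byte offset into base
    Fat(FatBlock base, int offset) { this.base = base; this.offset = offset; }
}
final class FatPointer extends Fat {
    FatPointer(FatBlock base, int offset) { super(base, offset); }
}
final class FatInt extends Fat {
    FatInt(int value) { super(null, value); }
}
final class FatBlock {
    private final Fat[] contents;
    FatBlock(int words) {
        contents = new Fat[words];
        Arrays.fill(contents, new FatInt(0));
    }
    Fat readFat(int off) { return contents[slot(off)]; }
    void writeFat(int off, Fat v) { contents[slot(off)] = v; }
    private int slot(int off) {
        if (off < 0 || off % 4 != 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        return off / 4;
    }
}

final class FatDemo {
    // Dereference any Fat value as a 1 word read.
    static Fat deref(Fat v) {
        if (v.base == null) {  // a FatInt has no block behind it
            throw new IllegalStateException("dereference of integer " + v.offset);
        }
        return v.base.readFat(v.offset);
    }

    public static void main(String[] args) {
        FatBlock m = new FatBlock(2);
        m.writeFat(0, new FatInt(5));         // word 0 holds the integer 5
        m.writeFat(4, new FatPointer(m, 0));  // word 1 holds a pointer to word 0
        System.out.println(deref(m.readFat(4)).offset);
        try {
            deref(m.readFat(0));              // like *(int *)5 in C
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```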
Implementation of the translator • Translator implemented in Objective Caml • Pipeline : C source code → lexer, parser (CIL [Necula et al. 02]) → C abstract syntax tree → code generator (the translator; we implemented this part) → Java abstract syntax tree → pretty-printer (Joust [Cooper]) → Java source code
Experiments • Benchmark programs taken from The Computer Language Shootout • Values in the graph are overheads • Fail-Safe C to Java user time / C user time • Handwritten Java user time / C user time • Environments • 2.80GHz Intel Pentium 4 CPU, 2GB memory • Linux 2.6 • gcc : version 4.0.0 with -O2 option • javac, java : Sun JDK version 1.5.0 with -O option
Consideration • All pointers, integers and memory blocks are translated to objects • Lots of object creations and method calls • Reduce these by optimizations • Translate variables to FatBlock only if they are pointed to by a pointer • Translate integers to FatInt only if they cannot be distinguished from pointers • Eliminate redundant calls to the same access method
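To give a feel for the first two optimizations, the sketch below contrasts an unoptimized rendering of a small C loop with one where the variables stay plain Java ints because no pointer ever refers to them. Both renderings are hand-written guesses at what such output could look like, not actual output of the translator; the class bodies and the method names sumNaive and sumOptimized are hypothetical.

```java
import java.util.Arrays;

// Supporting classes: names follow the slides, bodies are assumed.
abstract class Fat {
    final FatBlock base;
    final int offset;
    Fat(FatBlock base, int offset) { this.base = base; this.offset = offset; }
}
final class FatInt extends Fat {
    FatInt(int value) { super(null, value); }
}
final class FatBlock {
    private final Fat[] contents;
    FatBlock(int words) {
        contents = new Fat[words];
        Arrays.fill(contents, new FatInt(0));
    }
    Fat readFat(int off) { return contents[slot(off)]; }
    void writeFat(int off, Fat v) { contents[slot(off)] = v; }
    private int slot(int off) {
        if (off < 0 || off % 4 != 0 || off / 4 >= contents.length) {
            throw new IndexOutOfBoundsException("invalid offset " + off);
        }
        return off / 4;
    }
}

final class OptimizationDemo {
    // C: int sum = 0; for (int k = 0; k < n; k++) sum += k;
    // Unoptimized: every variable is a one-element block, every value an object.
    static int sumNaive(int n) {
        FatBlock sum = new FatBlock(1);
        FatBlock k = new FatBlock(1);
        sum.writeFat(0, new FatInt(0));
        for (k.writeFat(0, new FatInt(0));
             k.readFat(0).offset < n;
             k.writeFat(0, new FatInt(k.readFat(0).offset + 1))) {
            sum.writeFat(0, new FatInt(sum.readFat(0).offset + k.readFat(0).offset));
        }
        return sum.readFat(0).offset;
    }

    // Optimized: neither variable has its address taken or can hold a pointer,
    // so both can remain Java ints.
    static int sumOptimized(int n) {
        int sum = 0;
        for (int k = 0; k < n; k++) {
            sum += k;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumNaive(10) + " " + sumOptimized(10));
    }
}
```

The naive version allocates a new FatInt and makes several method calls per iteration, which is the kind of overhead the listed optimizations aim to remove.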
Related work : C to Java translators • Jazillian [Jazillian, Inc.] • Aims at readability and maintainability • Assumes that users will edit the generated code by hand • Ephedra [Martin et al. 01] • Does not support casts of pointers between different types • Does not support memory access via pointers of different types • Our translator • Aims at 100% ANSI conformance (and more) and memory safety • Supports memory access via cast pointers by using access methods
Related work : Safe C runtime systems • CCured [Necula et al. 02] • Dynamically checks unsafe operations • Reduces overheads by static analysis (to 3 - 87 %) • Does not aim at 100% ANSI conformance • Fail-Safe C [Oiwa et al. 01] • Based on the same ideas of fat pointers and fat integers • Both compile C to native code • If these compilers have a bug, unsafe code can be executed • Our translator translates C to Java source code • Provided that Java is safe, unsafe code is rejected (even if our translator has a bug) • We clarify the essence of Fail-Safe C
Conclusion and future work • We propose translation from C to Java • Simulate pointers, integers and memory blocks with Java objects • Future work • Support all ANSI C and more • Optimizations