Exploring Just-In-Time (JIT) Compilation in Computing
Delve into the world of Just-In-Time (JIT) compilation, its definitions, benefits, limitations, security aspects, history, and applications in Java, mobile, and web environments. Learn about compilers, interpreters, and JIT's impact on performance and memory usage.
JIT Zak Sang, Laura Tebben
Agenda • Definitions: Compilers vs. interpreters vs. JIT • Timeline • Benefits of JIT • Limitations of JIT • Security of JIT • Applications • Java • Mobile • Web
Introduction From Wikipedia: “In computing, just-in-time (JIT) compilation (also dynamic translation or run-time compilations) is a way of executing computer code that involves compilation during execution of a program – at run time – rather than prior to execution. Most often, this consists of source code or more commonly bytecode translation to machine code, which is then executed directly.”
Compiler • Translates all source code to binary machine code up front • Requires more memory, since object code is generated for the entire program • Performs well at run time, since the translation work is done ahead of time
Interpreter • Also translates source code to machine code • but does so at runtime • translation can be done at the line, statement, or instruction level • Requires less memory since object code is not produced • Instead machine code is generated as needed • Generally less performant than compiled programs because translation happens at runtime, although optimizations specific to the running environment can offset some of this cost
JIT Compiler • Translates source code to an IR (e.g., bytecode), which is dynamically translated to machine code at run time • Applies more optimization than interpreters do • Combines benefits of interpreters and traditional compilers • Translation to IR gets some of the translation work out of the way and saves computation at runtime • Optimizations happen at run time for frequently used portions of code • Platform-specific optimizations can be used • Allows for more flexibility
More Info on JIT • AOT compilation to bytecode • Parses source code and applies basic optimizations • Bytecode • Interpreted by the VM • JIT compiler dynamically translates bytecode to native machine code • Translated code is cached
Timeline - Compilers • 1950s - Early compiled languages used to allow for higher level languages • FORTRAN • 1957 - First commercially available compiler • COBOL • 1960 - First cross-platform compilation, on the UNIVAC II and RCA 501 • Performance optimization • 1970 - BLISS - A systems language eventually overtaken by C • Late 1980s - Programming directly in assembly starts to decline
Timeline - Interpreters • 1952 - Interpreters introduced to improve working with limited computers • Limited program storage space • ~1960 - LISP ‘eval’ function implemented by Steve Russell • Evaluate individual LISP expressions • Theorized in design by John McCarthy • 1975 - Microsoft BASIC • Popular on personal computers (1970s-1980s)
Timeline - JIT Compilers • ~1960 - LISP • translation at runtime • related to interpreters • 1968 - Regex in QED editor (Ken Thompson) • 1983 - Smalltalk • translation on demand • with caching of compiled code • resulted in the Self dialect • ~1993 - Java • Patterns derived from Self/Smalltalk
Timeline - JIT Compilers • ~2009 - JIT on the Web • Firefox 3.5 (TraceMonkey) • ~2009-2010 - JIT on Android • Tracing added to DVM in 2010 • ~2014 - JIT removed from Android • Replaced with native AOT compilation • Improved performance • Improved power consumption • 2016 - JIT added back to Android • Improves performance at runtime • Saves storage space for apps • Speeds up updates (since apps aren’t recompiled upfront on each update)
Compiled Code • Quick program startup time • Quick runtime • Compilation overhead incurred only once
Interpreted Code • Faster to develop in • More flexible (https://en.wikipedia.org/wiki/Interpreted_language) • Platform independent • Dynamic typing • Dynamic scoping • Reflection • Possibly a smaller executable program size
JIT - Storage • More portable since object code isn’t generated • Hardware agnostic • Bytecode is compact • Can potentially save a lot of memory because only parts of the program that are used are JIT-compiled
JIT - Performance • Faster startup time than compiling AOT • NOT faster than executing precompiled binary • Initial translation to IR makes JIT-compiled code faster than interpreters • Translation caching can help reduce the runtime translation effort • JIT compilation allows for platform-specific optimizations • Adaptive optimization uses runtime information to further optimize the generated machine code
JIT - Optimizations • Optimize to the CPU • Optimize to the version of the OS • Change compilation based on usage stats • Inline library functions • Can easily rearrange code for better cache utilization
Compiled Code • Lack of cross-platform support • Can’t optimize for runtime behavior • Slower to develop in compiled languages
Interpreted Code • Slower runtime than compiled code • Must ship source code • Prone to programmer error with regards to typing • More susceptible to code injection attacks
JIT - More complex implementation • Requires multiple steps • AOT compile to bytecode • Bytecode interpreted (HotSpot) • Compiled to machine code at runtime when threshold is hit • Recompiled with higher optimization when another threshold is hit • Deoptimized if assumptions change • Could drastically increase compilation time
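The threshold-driven steps above can be modelled with a call counter. This is a simplified sketch, not HotSpot's actual policy: the class name `TieredFunction`, the tier names, and the threshold values are all hypothetical, and plain Python callables stand in for interpreted, compiled, and speculatively optimized code.

```python
from typing import Any, Callable

INTERPRETED, BASELINE, OPTIMIZED = "interpreted", "baseline", "optimized"


class TieredFunction:
    """Promotes a function through tiers as it gets hot; deoptimizes on a failed guard."""

    def __init__(self,
                 generic: Callable[[Any], Any],      # always-correct version
                 specialized: Callable[[Any], Any],  # fast version, valid only if guard holds
                 guard: Callable[[Any], bool],       # the speculative assumption
                 baseline_threshold: int = 2,
                 optimize_threshold: int = 4) -> None:
        self.generic = generic
        self.specialized = specialized
        self.guard = guard
        self.baseline_threshold = baseline_threshold
        self.optimize_threshold = optimize_threshold
        self.calls = 0
        self.tier = INTERPRETED

    def __call__(self, x: Any) -> Any:
        if self.tier == OPTIMIZED and not self.guard(x):
            # Assumption no longer holds: fall back and start profiling again.
            self.tier = BASELINE
            self.calls = 0
        self.calls += 1
        result = self.specialized(x) if self.tier == OPTIMIZED else self.generic(x)
        self._maybe_promote()
        return result

    def _maybe_promote(self) -> None:
        if self.tier == INTERPRETED and self.calls >= self.baseline_threshold:
            self.tier = BASELINE       # first threshold: compile to machine code
        elif self.tier == BASELINE and self.calls >= self.optimize_threshold:
            self.tier = OPTIMIZED      # second threshold: recompile with more optimization


# Doubling works for ints and strings; the shift-based version assumes ints only.
double = TieredFunction(generic=lambda x: x * 2,
                        specialized=lambda x: x << 1,
                        guard=lambda x: isinstance(x, int))
```

With these thresholds, four integer calls move `double` to the optimized tier; a later call with a string fails the guard, so the function drops back to the baseline tier and still returns the correct result from the generic version.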
JIT - Heavier runtime • Like interpreters, JIT compilers require more work to be done during runtime which may take resources away from the main program • Increased memory, CPU usage • Often results in slower run speeds • Delay when compiling to machine code
Security • JIT requires executable pages containing arbitrary code to be generated • Code can be crafted to generate nop sleds and arbitrary code • Can be used with an arbitrary code execution vulnerability to run the generated code
[Animated diagram: a bytecode instruction stream (push, load, push, pop, push, ...) in which the operands var a, var b, and var c change from frame to frame]
Security - Executable Pages • JIT requires writable pages to be executable since machine code is generated at runtime (data is code) • Now arbitrary code can be generated (JIT Spray) • http://www.semantiscope.com/research/BHDC2010/BHDC-2010-Paper.pdf • Potential Vulnerability: • Attacker finds buffer overflow • Buffer is constructed containing valid instructions • Attacker redirects execution to these instructions
https://media.blackhat.com/bh-us-11/Rohlf/BH_US_11_RohlfIvnitskiy_Attacking_Client_Side_JIT_Compilers_Slides.pdf
Security - Executable Pages in JIT (solutions) • Have pages only be writable or executable • https://jandemooij.nl/blog/2015/12/29/wx-jit-code-enabled-in-firefox/
Security - Executable Pages in JIT (solutions) • JIT Emission Randomization • Similar to ASLR, but for output of JIT compiler • Can randomize spacing and ordering of functions • Randomization may still be predictable
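A rough sketch of emission randomization, under stated assumptions: the function `randomized_layout` and its parameters are hypothetical, function bodies are opaque byte strings, and padding uses the single-byte x86 NOP (0x90). Order and spacing are both randomized, and the returned offset table is what the runtime would use to find each function.

```python
import random
from typing import Dict, Optional, Tuple

NOP = b"\x90"   # single-byte x86 NOP, used here as padding


def randomized_layout(functions: Dict[str, bytes],
                      max_padding: int = 16,
                      rng: Optional[random.Random] = None) -> Tuple[bytes, Dict[str, int]]:
    """Place function bodies in a buffer in random order with random NOP gaps.

    Returns the buffer and a map from function name to its start offset.
    """
    if rng is None:
        rng = random.SystemRandom()     # OS entropy, so the layout is hard to predict
    names = list(functions)
    rng.shuffle(names)                  # randomize ordering
    buffer = bytearray()
    offsets: Dict[str, int] = {}
    for name in names:
        buffer += NOP * rng.randint(0, max_padding)   # randomize spacing
        offsets[name] = len(buffer)
        buffer += functions[name]
    return bytes(buffer), offsets
```

Passing a seeded `random.Random` makes the layout reproducible for testing, which is also exactly the weakness the slide warns about: if an attacker can predict the generator, the randomization gives no protection.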
Security - Executable Pages in JIT (solutions) • Constant folding - break constants into scattered 2-byte chunks and reassemble them at runtime • Can be defeated if scattering is predictable • Not effective on some instructions
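The 2-byte chunking can be illustrated as follows. This is only a sketch of the arithmetic, assuming 32-bit constants; the helper names are hypothetical, and a real JIT would emit instructions that rebuild the value rather than call a function.

```python
from typing import List


def split_constant(value: int) -> List[int]:
    """Split a 32-bit constant into two 16-bit chunks, low half first."""
    if not 0 <= value < 2 ** 32:
        raise ValueError("expected an unsigned 32-bit constant")
    return [value & 0xFFFF, (value >> 16) & 0xFFFF]


def reassemble(chunks: List[int]) -> int:
    """Rebuild the constant at run time from its 16-bit chunks."""
    value = 0
    for index, chunk in enumerate(chunks):
        value |= chunk << (16 * index)
    return value
```

Because no 4-byte run in the emitted code equals the attacker's chosen constant, the constant can no longer be used directly as a hidden instruction sequence, provided the chunks are not simply laid out next to each other in a predictable way.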
Security - Executable Pages in JIT (solutions) • Constant blinding - xor constants with a secret • Used by most browsers
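Constant blinding can be sketched with two x86 encodings: `mov eax, imm32` (opcode 0xB8) and `xor eax, imm32` (opcode 0x35). The emitter and the tiny simulator below are hypothetical helpers written for this illustration, not any browser's implementation. The point is that the attacker-chosen constant never appears verbatim in the emitted bytes, yet the register ends up holding the same value.

```python
import secrets
import struct
from typing import Optional

MOV_EAX_IMM32 = 0xB8   # x86: mov eax, imm32
XOR_EAX_IMM32 = 0x35   # x86: xor eax, imm32


def emit_plain(constant: int) -> bytes:
    """Unblinded: the constant is embedded verbatim in the instruction stream."""
    return bytes([MOV_EAX_IMM32]) + struct.pack("<I", constant)


def emit_blinded(constant: int, key: Optional[int] = None) -> bytes:
    """Blinded: load constant ^ key, then xor with key to recover the constant."""
    if key is None:
        key = secrets.randbits(32)      # per-constant secret chosen at JIT time
    return (bytes([MOV_EAX_IMM32]) + struct.pack("<I", constant ^ key)
            + bytes([XOR_EAX_IMM32]) + struct.pack("<I", key))


def run(code: bytes) -> int:
    """Simulate the two opcodes above and return the final value of eax."""
    eax, pc = 0, 0
    while pc < len(code):
        opcode = code[pc]
        (imm,) = struct.unpack_from("<I", code, pc + 1)
        if opcode == MOV_EAX_IMM32:
            eax = imm
        elif opcode == XOR_EAX_IMM32:
            eax ^= imm
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")
        pc += 5                          # one opcode byte plus a 4-byte immediate
    return eax
```

For a constant such as 0x3C909090, `emit_plain` contains the bytes 90 90 90 3C, while `emit_blinded` with any nonzero key does not start with them, and `run` returns the original constant for both.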
References • https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/jit_overview.html • https://en.wikipedia.org/wiki/Just-in-time_compilation • https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/jit_optimize.html • http://www.cs.columbia.edu/~aho/cs6998/Lectures/14-09-22_Croce_JIT.pdf • https://aboullaite.me/understanding-jit-compiler-just-in-time-compiler/ • https://en.wikipedia.org/wiki/Optimizing_compiler#History • https://en.wikipedia.org/wiki/History_of_compiler_construction • https://www.linuxjournal.com/forums/history-compilers • http://www.semantiscope.com/research/BHDC2010/BHDC-2010-Paper.pdf • https://jandemooij.nl/blog/2015/12/29/wx-jit-code-enabled-in-firefox/
References (contd) • https://media.blackhat.com/bh-us-11/Rohlf/BH_US_11_RohlfIvnitskiy_Attacking_Client_Side_JIT_Compilers_Slides.pdf • https://www.ibm.com/support/knowledgecenter/zosbasics/com.ibm.zos.zappldev/zappldev_85.htm • http://time.com/69316/basic/ • https://en.wikipedia.org/wiki/Interpreted_language • https://source.android.com/devices/tech/dalvik/jit-compiler.html • https://en.wikipedia.org/wiki/Android_Runtime
Agenda • Native (Java) • Mobile (Android) • Web (JavaScript)
The basics • Before runtime • Translates source code to IR (bytecode) • At runtime • Compiles bytecode to machine code • Determines semantics of individual bytecodes • Using trees • Saves the compilation results for future use
HotSpot JVM • Interpreter • Template-based • Maps machine code to each bytecode instruction • Tiered compilation • Client compiler (C1): • Focus is compilation speed • Limited set of optimizations • Server compiler (C2): • Focus is code performance • Heavy overhead
HotSpot JVM • C1 and C2 compilers • C1 • 3 phases • Front end constructs high level IR from bytecode (platform independent) • Uses static single assignment form to enable optimizations • Back end generates low level IR from high level IR (platform dependent) • Register allocation, optimization, machine code • C2 • Fully optimizing compiler • Uses static single assignment IR • Does typical compiler optimizations • Also some Java-specific optimizations • Null-check elimination, range-check elimination, etc
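Range-check elimination can be pictured as a before/after pair. This is a conceptual model only, not C2's algorithm: Python performs its own bounds checks on list indexing, so the explicit checks here merely stand in for the ones a JVM inserts around array accesses, and both function names are hypothetical.

```python
from typing import List


def sum_checked(values: List[int], n: int) -> int:
    """Unoptimized shape: an explicit range check guards every access."""
    total = 0
    for i in range(n):
        if not 0 <= i < len(values):    # per-iteration range check
            raise IndexError(i)
        total += values[i]
    return total


def sum_hoisted(values: List[int], n: int) -> int:
    """Optimized shape: one check before the loop covers every iteration."""
    if n > len(values):                 # single hoisted check
        raise IndexError(len(values))
    total = 0
    for i in range(n):                  # 0 <= i < n <= len(values) is now guaranteed
        total += values[i]
    return total
```

The transformation is valid because the loop bounds prove that every index falls inside the array once the single up-front comparison has passed, so the per-iteration checks are redundant.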