
Modern Compiler Internal Representations



Presentation Transcript


  1. Modern Compiler Internal Representations Silvius Rus 1/23/2002

  2. Presentation Navigator • Introduction • Challenges • Staged compilation • Generate efficient code • Case studies • Conclusions

  3. Traditional Compiler Organization • Pass: output type • Read code as text: ASCII characters • Lexical scanner: language words • Syntactic parser: language phrases • Translation: attribute grammar phrases • Output generated code: binary stream • Focus on pipelining due to memory window constraints
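
The pass pipeline on this slide can be sketched in a few lines. This is a minimal illustration, assuming a toy expression language with non-negative integers, `+`, `*`, and parentheses; the names `scan`, `translate`, and `run`, and the `PUSH`/`ADD`/`MUL` opcodes, are hypothetical. The parser emits code while it parses and never builds a tree.

```python
import re

# Hypothetical token pattern: an integer, or any single character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def scan(text):
    """Lexical scanner: characters -> ('num', value) or ('op', char) tokens."""
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", int(number)))
        elif op.strip():  # skip stray whitespace matches
            tokens.append(("op", op))
    return tokens

def translate(tokens):
    """Recursive-descent parser that emits stack code during the parse.

    Grammar: expr -> term ('+' term)*
             term -> factor ('*' factor)*
             factor -> num | '(' expr ')'
    """
    code = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        kind, value = advance()
        if kind == "num":
            code.append(("PUSH", value))
        elif (kind, value) == ("op", "("):
            expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
        else:
            raise SyntaxError(f"unexpected token {value!r}")

    def term():
        factor()
        while peek() == ("op", "*"):
            advance()
            factor()
            code.append(("MUL", None))

    def expr():
        term()
        while peek() == ("op", "+"):
            advance()
            term()
            code.append(("ADD", None))

    expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return code

def run(code):
    """Tiny stack machine standing in for the generated binary."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()
```

Each stage consumes the previous stage's output and keeps only a small amount of state (a token position and a stack), which fits the pipelining focus the slide mentions.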

  4. Traditional Compiler Internal Representation • Grammatical structure not always built explicitly • Implicit, built-in semantics • Simple data structures: • Transition tables • Token streams and stacks

  5. Presentation Navigator • Introduction • Challenges • Staged compilation • Generate efficient code • Case studies • Conclusions

  6. Compiler Challenges • Versatile: • Understand multiple languages • Generate output for various architectures • Generate efficient code: • Fast: as fast as code written directly in the output language • Portable: runs on multiple platforms • Verifiable: runs provably within a specified class of behavior • Secure: provably respects certain security requirements • Extendable: must be extended in order to: • Incorporate new input languages and/or target systems • Take advantage of advances in run-time environments (such as ISA changes, multithreading, distributed/parallel execution) • With a shared representation, L languages and A architectures need L + A components rather than L * A translators: L+A < L*A
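
The inequality L+A < L*A on this slide is read here as the usual argument for a shared IR: L front ends plus A back ends instead of one translator per (language, architecture) pair. The sketch below only does the counting; the function name and the target names in the usage line are hypothetical.

```python
def translators_needed(languages, targets):
    """Return (direct, shared_ir) component counts.

    direct:    one translator per (language, target) pair -> L * A
    shared_ir: one front end per language plus one back end per target -> L + A
    """
    direct = len(languages) * len(targets)
    shared_ir = len(languages) + len(targets)
    return direct, shared_ir

# Usage with illustrative names: 4 languages, 3 targets.
direct, shared = translators_needed(
    ["Fortran", "C", "C++", "Java"], ["x86", "SPARC", "MIPS"]
)
print(direct, shared)  # 12 7
```

The strict inequality needs at least two languages and two targets and fails for exactly two of each, where 2 + 2 = 2 * 2.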

  7. Understand Multiple Languages - Output for Multiple Targets • Abstract IR: • Same representation for Fortran, C, C++, Java, … • Possible only for conceptually similar languages • Good points: • Perform complex transformations on a single representation • Bad points: • Language semantics may either get lost or need an additional language-specific representation • Architecture-specific features are often more profitable to exploit than the common (abstractable) ones

  8. Presentation Navigator • Introduction • Challenges • Staged compilation • Generate efficient code • Case studies • Conclusions

  9. Staged Compilation • Stage 1: • Load source file (text) into IR1 – machine independent • Optimize IR1 • Stream IR1 to text file • Save/reload, pipe, HTTP, … text file • SUIF files, Java bytecode, .NET assembly • Stage 2: • Load text file into IR2 – machine dependent • Perform machine specific optimization on IR2 • Generate executable code or interpret IR2
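
The two stages can be sketched as below. This is a minimal illustration under stated assumptions: the source language is a hypothetical one-statement-per-line form `dest = a op b` with non-negative integer literals, the IR text format is invented for the example, and stage 2 interprets IR2 rather than generating machine code.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def stage1_load(source):
    """Stage 1: parse 'dest = a op b' lines into IR1 tuples (dest, op, a, b)."""
    ir1 = []
    for line in source.strip().splitlines():
        dest, _, a, op, b = line.split()
        ir1.append((dest, op, a, b))
    return ir1

def stage1_optimize(ir1):
    """Machine-independent optimization: fold operations on two literals."""
    folded = []
    for dest, op, a, b in ir1:
        if a.isdigit() and b.isdigit():
            folded.append((dest, "const", str(OPS[op](int(a), int(b)))))
        else:
            folded.append((dest, op, a, b))
    return folded

def stage1_emit(ir1):
    """Stream IR1 to text so it can be saved, piped, or sent over HTTP."""
    return "\n".join(" ".join(instr) for instr in ir1)

def stage2_run(text):
    """Stage 2: reload the text form and interpret it."""
    env = {}

    def value(operand):
        return int(operand) if operand.isdigit() else env[operand]

    for line in text.splitlines():
        fields = line.split()
        if fields[1] == "const":
            env[fields[0]] = int(fields[2])
        else:
            dest, op, a, b = fields
            env[dest] = OPS[op](value(a), value(b))
    return env

# Usage: the only thing stage 2 receives is the text produced by stage 1.
ir_text = stage1_emit(stage1_optimize(stage1_load("x = 2 + 3\ny = x * 4")))
print(ir_text)
print(stage2_run(ir_text))
```

Because the stages share nothing but the text, stage 2 can run later, on another machine, or inside a run-time system.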

  10. Staged Compilation

  11. Staged Compilation • Prepare IR1 so that stage 2 is very cheap • Quicksilver • Insert templated optimized object code in bytecode • Pack speculative optimization validation predicates in bytecode • Keep method dependence graphs explicitly in bytecode • Microsoft .NET • Explicit type/class information in IL • Preformatted, quickly accessible metadata • Strings, tables, heaps • Custom data • Allow embedding of native code
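
One way to picture the validation predicates: stage 1 ships a specialized routine together with the assumptions it was compiled under, so stage 2 only checks those assumptions instead of re-optimizing. The sketch below is an illustration only. The `IMAGE` layout, the fact name `area_overridden`, and `select_code` are hypothetical and do not describe Quicksilver's actual format.

```python
def select_code(image, runtime_facts):
    """Pick the specialized routine only if every packed predicate holds."""
    for fact, assumed in image["predicates"].items():
        if runtime_facts.get(fact) != assumed:
            return image["generic"]  # an assumption failed or is unknown
    return image["specialized"]

def area_generic(shape):
    # Generic path: dynamic dispatch through the object.
    return shape["area_fn"](shape)

def area_specialized(shape):
    # Specialized path: the callee was inlined in stage 1 under the
    # assumption that no subclass overrides it.
    return shape["w"] * shape["h"]

# Hypothetical stage-1 output: code plus the predicates that validate it.
IMAGE = {
    "predicates": {"area_overridden": False},
    "specialized": area_specialized,
    "generic": area_generic,
}
```

A missing fact is treated as a failed predicate, so the cheap check in stage 2 falls back to the generic code whenever it cannot confirm an assumption.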

  12. Presentation Navigator • Introduction • Challenges • Staged compilation • Generate efficient code • Case studies • Conclusions

  13. Generate Fast And Portable Code • Fast code • IR close to machine structure • Mapping data to registers • Mapping operations to opcodes • Scheduling instructions for superscalar/VLIW processors • Portable code • Machine description must be totally abstracted • Quicksilver: templated optimized code

  14. Generate Verifiable Code • Microsoft .NET IL • Static and dynamic type safety - reflection • Managed code • Carries a minimum of information about itself • Usually signed by compiler in Stage 1 • Managed data • Only accessible from managed code • Garbage collected • Managed pointers

  15. Generate Secure Code • Hard to define limits • Make sure you run what you mean to • Limit rights • Per user • Per software component • Quicksilver: digests • .NET IL: • Code is signed by encrypting a hash of the original • Permissions are set per module
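
The "make sure you run what you mean to" check can be sketched with a plain digest comparison. This is a simplified stand-in. It shows only the hashing and comparison step, with SHA-256 chosen for the example; a real signature would additionally encrypt the hash with the publisher's private key, which is not shown here. The function names are hypothetical.

```python
import hashlib
import hmac

def digest(code_bytes):
    """Hash the module contents (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(code_bytes).hexdigest()

def load_if_untampered(code_bytes, expected_digest):
    """Refuse to load a module whose contents no longer match the digest."""
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(digest(code_bytes), expected_digest):
        raise ValueError("digest mismatch: refusing to load module")
    return code_bytes

# Usage: the digest is computed when the module is produced and
# re-checked by the loader before the code runs.
module = b"\x01\x02 compiled module bytes"
published = digest(module)
load_if_untampered(module, published)
```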

  16. Generate Efficient Code • IR may also provide support for: • Versioning (Quicksilver, .NET) • Culture (.NET)

  17. Presentation Navigator • Introduction • Challenges • Staged compilation • Generate efficient code • Case studies • Conclusions

  18. Compiler Internal Representation - General Organization • High-level - completely machine independent • Abstract Syntax Tree • Control Flow Graph • Control Dependence Graph • Data Dependence Graph • Static Single Assignment • Medium-level - dependent on classes of machines • Virtual machine code, such as stack machine • Low level - dependent on particular ISA • Assembly, machine instruction graphs
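
As an example of one high-level structure from this slide, a control flow graph can be derived from a linear instruction list by finding basic-block leaders. This is a minimal sketch, assuming a hypothetical instruction encoding: `("label", name)`, `("jump", target)`, `("branch", cond, target)`, `("op", text)`, and `("ret",)`.

```python
def build_cfg(instrs):
    """Split instructions into basic blocks and return (blocks, edges)."""
    if not instrs:
        return [], []

    # A leader starts a block: the first instruction, any label,
    # and any instruction that follows a jump, branch, or return.
    leaders = {0}
    for i, ins in enumerate(instrs):
        if ins[0] == "label":
            leaders.add(i)
        if ins[0] in ("jump", "branch", "ret") and i + 1 < len(instrs):
            leaders.add(i + 1)

    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    blocks = [instrs[s:e] for s, e in zip(starts, ends)]

    # Labels are always leaders, so each label maps to exactly one block.
    label_to_block = {
        ins[1]: b
        for b, block in enumerate(blocks)
        for ins in block
        if ins[0] == "label"
    }

    edges = set()
    for b, block in enumerate(blocks):
        last = block[-1]
        if last[0] in ("jump", "branch"):
            edges.add((b, label_to_block[last[-1]]))  # taken edge
        if last[0] not in ("jump", "ret") and b + 1 < len(blocks):
            edges.add((b, b + 1))  # fall-through edge
    return blocks, sorted(edges)

# Usage: a counting loop.
LOOP = [
    ("op", "i = 0"),
    ("label", "loop"),
    ("branch", "i >= n", "exit"),
    ("op", "i = i + 1"),
    ("jump", "loop"),
    ("label", "exit"),
    ("ret",),
]
blocks, edges = build_cfg(LOOP)
print(edges)  # [(0, 1), (1, 2), (1, 3), (2, 1)]
```

The back edge `(2, 1)` is what later analyses, such as dependence graphs and SSA construction, use to recognize the loop.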

  19. Case Study: Polaris • High level representation • Abstract Syntax Tree • Control Flow Graph • Control Dependence Graph • Data Dependence Graph • Gated Static Single Assignment • Some generality • Backends for various parallel execution systems

  20. Case Study: SUIF2 • Multiple level representation • CFG, CDG, … • Quads • Machsuif • Custom annotations • Multiple frontends: Fortran, C, Java • Multiple backends: SUIF VM, C, assembly • Decoupled passes communicate only via SUIF • Extendable: OSUIF
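
The idea of decoupled passes that communicate only through the IR can be sketched as follows. This is not the SUIF API. The IR layout (a dict with `instrs` and `annotations`), the pass names, and the JSON round trip between passes are all assumptions made for the illustration.

```python
import json

def drop_nops(ir):
    """Transformation pass: remove no-op instructions."""
    ir["instrs"] = [ins for ins in ir["instrs"] if ins != ["nop"]]
    return ir

def count_ops(ir):
    """Analysis pass: record a custom annotation, leave the code untouched."""
    ir["annotations"]["op_count"] = len(ir["instrs"])
    return ir

def run_passes(ir, passes):
    """Run passes in order, serializing the IR between them.

    The round trip through JSON stands in for writing and re-reading an
    IR file, so a pass cannot hand hidden state to the next one.
    """
    for compiler_pass in passes:
        ir = json.loads(json.dumps(compiler_pass(ir)))
    return ir

# Usage
program = {"instrs": [["add", "x", "y"], ["nop"], ["ret"]], "annotations": {}}
result = run_passes(program, [drop_nops, count_ops])
print(result["annotations"])  # {'op_count': 2}
```

Anything a later pass needs from an earlier one has to be stored in the IR, which is the role the custom annotations play on this slide.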

  21. Case Study: Promis • Switch to Promis organization presentation • Switch to Promis IR presentation

  22. Case Study: KCC • Kuck and Associates (KAI) C++ compiler: • C++ dedicated internal representation • Advanced C++ specific optimization • Proprietary C++ specific object format • Interprocedural optimization with modular compilation • C++ specific debug information – usable with KDB • Outputs C with calls to proprietary run-time library • Uses GNU gcc to generate machine code

  23. Case Study: Jalapeño Quicksilver • Quasi-static images • Java bytecode + proprietary format • Representation allows for optimizations • Explicit method dependence graph • Templated optimized object code • Speculative optimization validation predicates

  24. Case Study: .NET • Advertised as a nine-figure (USD) project • CLI (ECMA standard) • Common type system • Type info in intermediate code • Common exception system • Throw in Visual Basic, catch in C++ • Support for security, culture, versioning • Support for per-use charging • Custom information can be attached to describe source-language-specific details • Pipeline: 30+ languages → MSIL → native code

  25. Other Compilers – Open Source • GNU compiler: • C, Fortran, Java, C++ front-ends • Generates code for all major architectures • Low level internal representation • New version (3.x) has SSA • SGI open source project: discontinued

  26. Other Compilers – Commercial • Fortran, C, C++, Java compilers produced by OS and/or hardware vendors • HP, SGI, Intel, Microsoft, Sun • Other commercial compiler producers: • Borland, Watcom, etc. • Internal representation – company secret

  27. Presentation Navigator • Introduction • Challenges • Staged compilation • Generate efficient code • Case studies • Conclusions

  28. Conclusions • Internal representations evolved, driven by: • Programming paradigms • Changes in hardware • Changes in compiler/run-time system technology • New issues: security, verifiability, culture, versioning • Tendency: E Pluribus Unum
