
An Introduction to Open64 Compiler


Presentation Transcript


  1. An Introduction to Open64 Compiler Guang R. Gao (CAPSL, UDel) Xiaomi An (CAPSL, UDel) Courtesy of Fred Chow

  2. Outline • Background and Motivation • Part I: An overview of the Open64 compiler infrastructure and design principles • Part II: Using Open64 in compiler research & development

  3. What is the Original Open64 (Pro64)? • A suite of optimizing compiler tools for Intel IA-64 and x86-64 on Linux systems • C, C++ and Fortran 90/95 compilers • Conforms to the IA-64 and x86-64 Linux ABI and API standards • Open to all researchers/developers in the community

  4. Historical Perspectives (timeline diagram, courtesy of Fred Chow) • Compiler lineage: Stanford RISC compiler research → Cydrome Cydra5 Compiler → MIPS Ucode Compiler (R2000) → SGI Ragnarok Compiler (R8000) → MIPS Ucode Compiler (R4000) → SGI MIPSpro Compiler (R10000) → Pro64/Open64 Compiler (Itanium, 2000) • Technology milestones along the way (1980-83, 1987, 1989, 1991, 1994, 1997): software pipelining, global opt under -O2, floating-pt performance, loop opt under -O3, Stanford SUIF, Rice IPA

  5. Who Might Want to Use Open64? • Researchers: test new compiler analysis and optimization algorithms • Developers: retarget to another architecture/system • Educators: a compiler teaching platform

  6. Who Is Using Open64 – from the ACM CGO 2008 Open64 Workshop attendee list (April 6, 2008, Boston) • 12 companies: Google, NVIDIA, IBM, HP, Qualcomm, AMD, Tilera, PathScale, SimpLight, Absoft, Coherent Logix, STMicro • 9 educational institutions: USC, Rice, U. of Delaware, U. of Houston, U. of Illinois, Tsinghua, Fudan, ENS Lyon, NRCC

  7. Vision and Status of Open64 Today • It can be viewed as GCC with an alternative back end, with great potential to reclaim its place as the best compiler in the world • The technology incorporates the top compiler optimization research of the 1990s • It has regained momentum in the last three years thanks to PathScale's and HP's investment in robustness and performance • Targets x86 and Itanium in the public repository; ARM, MIPS, PowerPC, and several signal-processing CPUs in private branches

  8. Overview of Open64 Infrastructure • Logical compilation model and component flow • WHIRL Intermediate Representation • Very High Optimizer • Inter-Procedural Analysis (IPA) • Loop Nest Optimizer (LNO) and Parallelization • Global Optimization (WOPT) • Code Generation (CG)

  9. Logical Compilation Model and Component Flow (diagram): Front End → Very High Optimizer → Interprocedural Analysis and Optimization → Loop Nest Optimization and Parallelization → Global (Scalar) Optimization → Code Generation; the middle-end phases and the back end communicate through a good common IR

  10. Front Ends • C front end based on gcc • C++ front end based on g++ • Fortran 90/95 front end from MIPSpro

  11. Semantic Level of IR • At a higher level (closer to the source program): more kinds of constructs, shorter code sequences, more program information present, hierarchical constructs, but many optimizations cannot be performed • At a lower level (closer to machine instructions): fewer kinds of constructs, longer code sequences, less program information, flat constructs, but all optimizations can be performed (a concrete illustration follows below)
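
To make the level distinction concrete, the sketch below shows in plain C++ (not actual WHIRL) how one source statement might look when expressed at a high, hierarchical level versus after lowering to flat, machine-oriented operations; the lowered form is purely illustrative.

```cpp
#include <cstddef>

// High-level form: hierarchical array construct; bounds and element type are visible.
void high_level(double a[100][100], int i, int j, double x) {
    a[i][j] = x;                       // one statement, rich structural information
}

// Lowered form: the same store expressed as flat address arithmetic,
// roughly the shape a low-level IR would carry (illustrative only).
void low_level(double *a_base, int i, int j, double x) {
    std::size_t offset = (static_cast<std::size_t>(i) * 100 + j) * sizeof(double);
    char *byte_addr = reinterpret_cast<char *>(a_base) + offset;
    *reinterpret_cast<double *>(byte_addr) = x;   // array structure no longer visible
}
```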

  12. Compilation Flow (diagram)

  13. Very High WHIRL Optimizer Lowers Very High WHIRL to High WHIRL while performing optimizations. The first part deals with common language constructs: • Bit-field optimizations • Short-circuit boolean expressions • Switch statement optimization • Simple if-conversion • Assignments of small structs: lower struct copies to assignments of individual fields • Conversion of code-sequence patterns to intrinsics: saturated subtract, abs(), max, min • Other pattern-based optimizations (some of these source patterns are illustrated below)
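
As an illustration of the kinds of source patterns such pattern matching targets, here are ordinary C++ functions (not Open64 code); whether a given pattern is actually converted depends on the intrinsics available for the target.

```cpp
#include <cstdint>

// A max pattern that pattern matching may turn into a MAX intrinsic.
int max_pattern(int a, int b) {
    return (a > b) ? a : b;
}

// A saturated-subtract pattern: the result clamps at zero instead of wrapping.
std::uint32_t saturated_sub(std::uint32_t x, std::uint32_t y) {
    return (x > y) ? (x - y) : 0u;
}

// A simple if-conversion candidate: the branch can become a conditional select.
int if_conversion_candidate(bool cond, int t, int f) {
    int r = f;
    if (cond) r = t;
    return r;
}
```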

  14. Roles of IPA The only optimization component operating at program scope • Analysis: collects information from the entire program • Optimization: performs optimizations across procedure boundaries • Depends on later phases for full optimization effects • Supplies cross-file information to later optimization phases

  15. IPA Flow (diagram)

  16. IPA Main Stage • Analysis: alias analysis, array sections, code layout • Optimization: inlining, cloning, dead function and variable elimination, constant propagation (a cross-file sketch follows below)
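
A hedged two-file sketch of the kind of opportunity IPA exposes: with whole-program knowledge, the call below can be inlined and its constant argument propagated, after which the dead branch and the now-unreferenced helper can be removed. The file and function names are made up for illustration.

```cpp
// "util.cpp" (illustrative): a helper nominally defined in another file.
int scale(int x, int factor) {
    if (factor == 0) return 0;   // dead once IPA proves factor is always 8
    return x * factor;
}

// "main.cpp" (illustrative): the only call site passes a constant argument.
int driver(int x) {
    return scale(x, 8);          // after IPA inlining + constant propagation,
}                                // this can reduce to: return x * 8;
```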

  17. Loop Nest Optimizations • Transformations for the data cache • Transformations that help other optimizations • Vectorization and parallelization

  18. LNO Transformations for the Data Cache • Cache blocking: transform loops to work on sub-matrices that fit in cache • Loop interchange • Array padding: reduce cache conflicts • Prefetch generation: hide the long latency of cache-miss references • Loop fusion • Loop fission (cache blocking is sketched below)
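
A hedged sketch of what cache blocking does to a loop nest, hand-written here to show the shape of the transformation; the compiler derives the block size from the cache configuration, so the value 64 is only a placeholder.

```cpp
const int N = 1024;

// Original loop nest: walks whole matrices, poor cache reuse for large N.
void copy_transpose(double (&dst)[N][N], const double (&src)[N][N]) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            dst[i][j] = src[j][i];
}

// Cache-blocked version: works on B x B sub-matrices that fit in cache.
void copy_transpose_blocked(double (&dst)[N][N], const double (&src)[N][N]) {
    const int B = 64;                          // placeholder block size
    for (int ii = 0; ii < N; ii += B)
        for (int jj = 0; jj < N; jj += B)
            for (int i = ii; i < ii + B; ++i)
                for (int j = jj; j < jj + B; ++j)
                    dst[i][j] = src[j][i];
}
```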

  19. LNO Transformations that Help Other Optimizations • Scalar expansion / array expansion: reduce loop-carried dependences, enable parallelization • Scalar variable renaming: fewer constraints for register allocation • Array scalarization: improves register allocation • Hoisting of messy loop bounds • Outer loop unrolling • Array substitution (forward and backward) • Loop unswitching • Hoisting of IFs • Inter-iteration CSE (scalar expansion is sketched below)
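
A hedged, hand-written illustration of scalar expansion (not compiler output): the single temporary t creates a false dependence between iterations; expanding it into an array makes every iteration independent, at the cost of extra storage.

```cpp
const int N = 1000;

// Before: the scalar t is written and read in every iteration,
// so iterations cannot safely run in parallel as written.
void before(double (&a)[N], double (&b)[N], const double (&c)[N]) {
    double t;
    for (int i = 0; i < N; ++i) {
        t = c[i] * c[i];
        a[i] = t + 1.0;
        b[i] = t - 1.0;
    }
}

// After scalar expansion: each iteration owns t_expanded[i],
// removing the false dependence and enabling parallelization.
void after(double (&a)[N], double (&b)[N], const double (&c)[N]) {
    static double t_expanded[N];
    for (int i = 0; i < N; ++i) {
        t_expanded[i] = c[i] * c[i];
        a[i] = t_expanded[i] + 1.0;
        b[i] = t_expanded[i] - 1.0;
    }
}
```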

  20. LNO Parallelization • SIMD code generation: highly dependent on the SIMD instructions of the target • Generation of vector intrinsics: based on the library functions available • Automatic parallelization: leverages the OpenMP support in the rest of the back end (a candidate loop is shown below)
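
For reference, a loop of the shape below is a typical SIMD and auto-parallelization candidate; the OpenMP pragma shows the kind of parallel form automatic parallelization targets. This is a hand-written example, not compiler output.

```cpp
const int N = 4096;

// A straight-line, dependence-free loop: a natural candidate for SIMD
// code generation and, at larger granularity, automatic parallelization.
void saxpy(float (&y)[N], const float (&x)[N], float a) {
#pragma omp parallel for          // the form auto-parallelization would target
    for (int i = 0; i < N; ++i)
        y[i] = a * x[i] + y[i];
}
```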

  21. Global Optimization Phase • SSA is the unifying technology • Open64 extensions to SSA technology: • Representation of aliases and indirect memory operations (Chow et al., CC 96) • Integrated partial redundancy elimination (Chow et al., PLDI 97; Kennedy et al., CC 98, TOPLAS 99) • Support for speculative code motion • Register promotion via load and store placement (Lo et al., PLDI 98) (PRE is illustrated below)
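
A hedged before/after illustration of partial redundancy elimination, hand-written for clarity; SSAPRE operates on WHIRL in SSA form, not on C++ source. The expression a * b is evaluated twice on the path through the if, and PRE places the computation so it is evaluated at most once on every path.

```cpp
// Before PRE: a * b is computed twice on the path where cond is true.
int before_pre(int a, int b, bool cond) {
    int x = 0;
    if (cond)
        x = a * b;        // first evaluation on this path
    int y = a * b;        // evaluated again on the same path
    return x + y;
}

// After PRE (conceptually): the computation is placed in a temporary so that
// a * b is evaluated exactly once on every path through the function.
int after_pre(int a, int b, bool cond) {
    int t = a * b;        // single evaluation covering both uses
    int x = 0;
    if (cond)
        x = t;
    int y = t;
    return x + y;
}
```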

  22. Overview • Works at function scope • Builds the control flow graph • Performs alias analysis • Represents the program in SSA form • SSA-based optimization algorithms • Cooperation among multiple phases to achieve the final effect • Phase order designed to maximize effectiveness • Separated into Preopt and Mainopt • Preopt serves as a pre-optimizing front end for LNO and IPA (in High WHIRL) • Provides use-def info to LNO and IPA • Provides alias info to CG

  23. Optimizations Performed Pre-optimizer: • Goto conversion • Loop normalization • Induction variable canonicalization • Dead store elimination • Copy propagation • Dead code elimination • Alias analysis (flow-free and flow-sensitive) • Computation of def-use chains for LNO and IPA • Passing of alias info to CG Main optimizer: • Partial redundancy elimination based on the SSAPRE framework • Global common subexpression elimination • Loop-invariant code motion • Strength reduction • Linear function test replacement • Value-number-based full redundancy elimination • Induction variable elimination • Register promotion • Bitwise dead store elimination (strength reduction and test replacement are sketched below)
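
A hedged before/after sketch of strength reduction with linear function test replacement, hand-written; the main optimizer performs this on SSA form rather than on source. The multiplication inside the loop becomes an addition on a new induction variable, and the exit test is rewritten in terms of that variable so the original i can be eliminated.

```cpp
// Before: each iteration computes i * 8 to index the array and tests i against n.
long every_eighth(const long *a, int n) {
    long s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i * 8];
    return s;
}

// After strength reduction + linear function test replacement (conceptually):
// i * 8 becomes an increment of idx, and the exit test is rewritten against
// idx so the original induction variable i is no longer needed.
long every_eighth_sr(const long *a, int n) {
    long s = 0;
    long limit = static_cast<long>(n) * 8;
    for (long idx = 0; idx < limit; idx += 8)
        s += a[idx];
    return s;
}
```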

  24. Feedback • Used throughout the compiler • Instrumentation can be added at any stage: VHO, LNO, WOPT, CG • Explicit instrumentation data is incorporated where inserted • Instrumentation data is maintained and checked for consistency through program transformations

  25. Smooth Information Flow into the Back End in Pro64 (diagram): WHIRL is translated by WHIRL-to-TOP into CGIR, carrying information from the front end (alias, structure, etc.); the CG phases include hyperblock formation, critical path reduction, extended basic block optimization, inner loop optimization, software pipelining, control flow optimization, IGLS (instruction scheduling), GRA/LRA (global/local register allocation), and code emission, producing the executable.

  26. Software Pipelining vs. Normal Scheduling (decision flowchart): is the loop a SWP-amenable candidate? If yes, it undergoes inner loop processing and software pipelining; on success it proceeds to code emission, while on failure, or when not profitable, it falls back to the normal path. If no, it takes the normal scheduling path: IGLS, GRA/LRA, IGLS, then code emission.

  27. Code Generation Intermediate Representation (CGIR) • TOPs (Target Operations) are "quads" • Operands/results are TNs • Basic block nodes in the control flow graph • Load/store architecture • Supports predication • Flags on TOPs (copy ops, integer add, load, etc.) • Flags on operands (TNs) (a simplified illustration follows below)
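
To make the "quad" idea concrete, here is a deliberately simplified, hypothetical data-structure sketch; these types and fields are invented for illustration and are not the actual Open64 CGIR declarations. An operation names its target opcode (TOP), its result TNs, and its operand TNs, plus flags that CG phases query.

```cpp
#include <vector>

// Hypothetical illustration only -- not the real Open64 CGIR types.
enum class Top { Copy, IntAdd, Load, Store, Branch };   // target operations

struct Tn {             // stand-in for a TN: a virtual register or constant
    int  id;
    bool is_constant;
    long const_value;
};

struct Op {             // a "quad": opcode plus result and operand TNs
    Top opcode;
    std::vector<Tn*> results;
    std::vector<Tn*> operands;
    unsigned flags;     // e.g. "is a copy", "is a load" -- queried by CG phases
};
```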

  28. From WHIRL to CGIR (cont'd) • Information passed: alias information, loop information, symbol table and maps

  29. The Target Information Table (TARG_INFO) • Objective: a parameterized description of a target machine and system architecture • Separates architecture details from the compiler's algorithms • Minimizes compiler changes when targeting a new architecture

  30. WHIRL SSA: A New Optimization Infrastructure for Open64 (Parallel Processing Institute, Fudan University, Shanghai, China; Global Delivery China Center, Hewlett-Packard, Shanghai, China)

  31. Goal • A better "DU manager": • Factored UD chains • Reduced traversal overhead • Keeps alias information • Handles both direct and indirect accesses • Eliminates 'incomplete DU/UD chains' • Easy to use: STL-style iterator to traverse the DU/UD chain • A flexible infrastructure: • Available from High WHIRL to Low WHIRL • Lightweight, demand-driven • Precise and updatable (a hypothetical usage sketch follows below)
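
To give a feel for what an "STL-style iterator to traverse the DU/UD chain" could look like in client code, here is a purely hypothetical sketch; the class and method names below are invented for illustration and are not the actual WHIRL SSA API.

```cpp
#include <vector>

// Hypothetical stand-ins for a WHIRL node and a DU-chain container.
struct WN;                                   // a WHIRL node (opaque here)

class DuChain {                              // invented: the uses of one definition
public:
    using iterator = std::vector<WN*>::iterator;
    iterator begin() { return uses_.begin(); }
    iterator end()   { return uses_.end(); }
private:
    std::vector<WN*> uses_;
};

// Invented client code: walk every use reached by one definition,
// in the same begin()/end() style as any STL container.
void visit_uses(DuChain &chain, void (*visit)(WN *)) {
    for (DuChain::iterator it = chain.begin(); it != chain.end(); ++it)
        visit(*it);
}
```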

  32. PHI Placement • For structured control flow (SCF), φ nodes are mapped onto the root WN • For GOTO-LABEL control flow, φ nodes are placed on the LABEL (a small SSA illustration follows below)
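
For readers new to SSA, the hedged sketch below shows, as C++ plus comments rather than WHIRL, why a φ node is needed at a control-flow merge: the variable x has two reaching definitions, and the φ selects between them at the join point.

```cpp
// Source form: x has two definitions that reach the statement after the if.
int merge_example(bool cond, int a, int b) {
    int x;
    if (cond)
        x = a + 1;        // definition 1
    else
        x = b + 2;        // definition 2
    return x * 3;         // both definitions reach here
}

// SSA view (comments only): each definition gets its own name, and a phi at
// the merge point (the root WN of the structured IF, or the join LABEL for
// goto-based control flow) selects the incoming value:
//
//   x1 = a + 1            // then-branch
//   x2 = b + 2            // else-branch
//   x3 = phi(x1, x2)      // placed at the control-flow merge
//   return x3 * 3
```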

  33. Thank you!
