Enhancing the Role of Inlining in Effective Interprocedural Parallelization

Jichi Guo, Mike Stiles, Qing Yi, Kleanthis Psarris

Problem • Inter-procedural parallelization • Parallelize after inlining • Gain more parallelizable loops • Loss of parallelized loops • Inlining messes up caller / callee • Missed parallel opportunities • Inlining increases code complexity

Goal • Keep the gained parallelizable loops • Prevent the loss of parallelism • Discover the missed opportunities

Solution • Summarize the code using annotation • Express the underlying information • Inline the annotation before parallelization • Pass the summarized information to the compiler • Reverse-inline after parallelization • Revert inlining side effects • Maintain equivalence

Outline • Innovations • Problems of parallel + inline strategy • Annotation language • Annotation-based inlining technique • Experiments • Summary

Problems of parallel + inlining • Parallel + inlining • Conventional inlining with heuristics and pre-transformations • Heuristics: code size • Transformations: linearization, forward substitution • Intra-procedural loop parallelization • Fortran do-all loop • Goal • Gain loops in caller • Problems • Lost loops in caller / callee • Missed loops in caller

Problems of parallel + inlining • Loss of parallelizable loops in caller / callee • Transformations that cause the loss • Forward substitution • Linearization • Forward substitution of non-linear subscripts • Creates indirect array references • Linearization of array dimensions • Messes up array shapes

Problems of parallel + inlining • Forward substitution of non-linear subscripts • Create indirect array references X2(I) ⇒ T(IX(7) + I) Y2(I) ⇒ T(IX(8) + I) Z2(I) ⇒ T(IX(9) + I)
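
The effect on this slide can be sketched in Python (used here only for illustration; the slides concern Fortran). After forward substitution, X2(I) and Y2(I) become accesses to the shared array T at offsets read from IX, so whether they touch separate regions depends on values known only at run time. The helper `regions_overlap` and the two offset tables are hypothetical, not taken from the benchmark.

```python
def regions_overlap(base_a, base_b, n):
    """True if T(base_a+1 .. base_a+n) and T(base_b+1 .. base_b+n) share an element."""
    return abs(base_a - base_b) < n


# Hypothetical stand-ins for the IX offset table; the real values are
# only known at run time, which is what defeats static dependence tests.
ix_disjoint = {7: 0, 8: 100, 9: 200}
ix_aliased = {7: 0, 8: 50, 9: 200}
n = 100  # assumed trip count of the loop over I

# With these offsets X2 and Y2 occupy separate slices of T.
safe = not regions_overlap(ix_disjoint[7], ix_disjoint[8], n)

# With these offsets the slices overlap, so the accesses may conflict.
unsafe = regions_overlap(ix_aliased[7], ix_aliased[8], n)
```

A compiler that only sees T(IX(7) + I) and T(IX(8) + I) has to assume the second case is possible, which is one way the loop can lose its parallel status after inlining.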

Problems of parallel + inlining • Linearization of array dimensions • Mess up array shapes PP(i, j, k) ⇒ PP(i + j*4 + k*16)
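
A minimal Python sketch of why linearization hides the array shape. It assumes zero-based indices and an extent of 4 for each of the first two dimensions, which is what the factors 4 and 16 on the slide suggest; the function name `linearize` is made up for this example.

```python
from itertools import product


def linearize(i, j, k, ni=4, nj=4):
    """Map a 3-D subscript to the 1-D offset i + j*ni + k*ni*nj."""
    return i + j * ni + k * ni * nj


# Inside the declared bounds, every (i, j, k) maps to a distinct offset.
offsets = [linearize(i, j, k) for i, j, k in product(range(4), range(4), range(2))]
all_distinct = len(set(offsets)) == len(offsets)

# Once the bounds are no longer visible, different subscripts can alias:
# (4, 0, 0) and (0, 1, 0) both map to offset 4.
collision = linearize(4, 0, 0) == linearize(0, 1, 0)
```

With the three-dimensional form, a dependence test can compare each subscript separately. With the single linear expression it also needs the bounds on i and j to prove that two iterations touch different elements.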

Problems of parallel + inlining • Missed parallelizable loops in caller • Coding styles that cause missed loops • Opaque compositional subroutines • A calls B, B calls C, C calls D, … • Array access • When it is difficult to determine which part is killed • Debugging and Error Checking • Statement that breaks the dependency is never executed • I/O statements • Indirect array references • ID=IDX(I), X = A(ID)

Problems of parallel + inlining • Opaque compositional subroutines • A calls B, B calls C, C calls D, …

Problems of parallel + inlining • Array access • Difficult to determine which part is killed CTR computed at runtime

Problems of parallel + inlining • Debugging and Error Checking • Statement that breaks the dependency is never executed • I/O statements

Problems of parallel + inlining • Indirect array references IN=>NODE NODE=>IREL IREL=>RHSB

The annotation language • Goal • Summarize information • Avoid ambiguity

The annotation language • Restricted grammar • Special operators • Writing annotations

The annotation language • Restricted grammar • Do-all loop only • No goto

The annotation language • Special operators y = operator(x1, x2, …, xn) Purpose: abstract relation • Unknown operator • Relation is unknown • Generic functions • Unique operator • Relation is one-to-one, from X to Y
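
Why a one-to-one relation matters can be shown with a small Python sketch (illustrative only; the names `is_one_to_one` and `scatter` are hypothetical). If the index map is one-to-one, no two iterations write the same element, so the iterations can run in any order.

```python
def is_one_to_one(index_map):
    """True if no two iterations are mapped to the same element."""
    return len(set(index_map)) == len(index_map)


def scatter(values, index_map, size, order):
    """Run out[index_map[i]] = values[i] for i in the given iteration order."""
    out = [0] * size
    for i in order:
        out[index_map[i]] = values[i]
    return out


values = [10, 20, 30]
unique_map = [2, 0, 1]   # one-to-one: each iteration owns its element
clashing_map = [0, 0, 1]  # iterations 0 and 1 write the same element

forward, backward = [0, 1, 2], [2, 1, 0]

# One-to-one map: the result does not depend on iteration order.
same = scatter(values, unique_map, 3, forward) == scatter(values, unique_map, 3, backward)

# Clashing map: the result depends on which iteration runs last.
differs = scatter(values, clashing_map, 3, forward) != scatter(values, clashing_map, 3, backward)
```

The unique operator lets an annotation assert this property for an index array whose contents the compiler cannot inspect.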

The annotation language • Writing annotations • Eliminating adverse side effects • Preserve caller and callee if inlining breaks the dependency • Summarize opaque subroutines • Eliminate nested function calls • Array access • Specify the exact range that is read/modified • Debugging and error handling • Aggressive strategy: ignore checking statements • Indirect array references • Discover unique relation

The annotation language • Summarize opaque subroutines • Eliminate nested function calls

The annotation language • Array access • Specify the exact range that is read/modified

The annotation language • Debugging and error handling • Aggressive strategy: ignore checking statements

The annotation language • Indirect array references • Discover unique relation

Annotation-based inlining • Goal • Pass annotated information to the compiler • Eliminate inlining side effects • Flow • Inline before parallelization • Reverse-inline after parallelization • Verify and evaluate at the end • Implementation • POLARIS compiler for parallelization • ROSE compiler for parsing • POET transformer • PERFECT benchmark

Annotation-based inlining • Workflow • Annotation inlining⇒ Parallelization ⇒ Reverse-inlining
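
The three stages can be sketched as a toy Python pipeline over lines of source text. Everything here is an assumption made for illustration: the marker comments, the function names, the summary text, and the `parallelize` stub, which only stands in for a real parallelizing compiler and does no dependence analysis.

```python
BEGIN, END = "! BEGIN_INLINE ", "! END_INLINE"  # hypothetical marker comments


def inline_annotations(lines, summaries):
    """Replace each annotated call with its summary, wrapped in markers."""
    out = []
    for line in lines:
        if line.strip() in summaries:
            out.append(BEGIN + line)  # remember the original call
            out.extend(summaries[line.strip()])
            out.append(END)
        else:
            out.append(line)
    return out


def parallelize(lines):
    """Stub parallelizer: tags every DO loop (no real analysis is done)."""
    out = []
    for line in lines:
        if line.strip().startswith("DO "):
            out.append("! PARALLEL DOALL")
        out.append(line)
    return out


def reverse_inline(lines):
    """Restore the original call in place of each marked region."""
    out, skipping = [], False
    for line in lines:
        if line.startswith(BEGIN):
            out.append(line[len(BEGIN):])
            skipping = True
        elif line == END:
            skipping = False
        elif not skipping:
            out.append(line)
    return out


caller = ["DO I = 1, N", "CALL UPDATE(A, I)", "END DO"]
summaries = {"CALL UPDATE(A, I)": ["A(I) = unknown(A(I))"]}
result = reverse_inline(parallelize(inline_annotations(caller, summaries)))
```

In this sketch the caller ends up with its loop tagged as parallel and the original call restored, while the callee itself is never modified.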

Annotation-based inlining • Inlining annotation • Steps • Annotation ⇒ source language • Translating special operators • Inlining the generated source code • Avoiding linearization • Translating special operators • Unknown: using uninitialized global arrays • Unique: using linear expression • Avoiding linearization
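
The operator translation listed above might look roughly like the following Python sketch, which emits Fortran-like expression text. The function name, the `UNKNOWN_<n>` array naming scheme, and the exact linear form are hypothetical; the slides state only that unknown maps to an uninitialized global array and unique maps to a linear expression.

```python
def translate_operator(kind, args, site_id):
    """Return source text standing in for one annotation operator.

    kind    -- "unknown" or "unique"
    args    -- list of argument expressions as strings
    site_id -- integer that distinguishes this operator occurrence
    """
    if kind == "unknown":
        # A read from a global array that is never initialized: the
        # compiler cannot infer anything about the resulting value.
        return f"UNKNOWN_{site_id}({', '.join(args)})"
    if kind == "unique":
        if len(args) != 1:
            raise ValueError("this sketch handles one-argument unique() only")
        # A linear expression in the argument is one-to-one, which a
        # standard dependence test can recognize.
        return f"({args[0]}) + {site_id}"
    raise ValueError(f"unsupported operator: {kind}")


unknown_text = translate_operator("unknown", ["I", "J"], 3)
unique_text = translate_operator("unique", ["I"], 2)
```

Because these stand-in expressions do not compute the real values, they are only valid while the parallelizer analyzes the code; reverse inlining has to put the original call back afterwards.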

Annotation-based inlining • Inlining annotation

Annotation-based inlining • Parallelize do-all loops

Annotation-based inlining • Reverse inlining

Annotation-based inlining • Reverse inlining is indispensable • Inlined code is restored to a function call • Avoid loss of parallelism in caller / callee • Enable abstraction operators (unknown, unique)

Annotation-based inlining • Verification and evaluation • Correctness, Efficiency, and Generality

Experiment • Purpose • What does conventional inlining bring to parallelization • Gain? • Loss? • Missed? • How well does annotation-based inlining avoid the above issues • Design • PERFECT benchmarks (except SPEC77) • Two machines • 8-core Intel Mac • 4-core AMD Opteron • End compiler • GFortran 4.2.1 • IFort 11.1 • Result • Count of Loops • Performance

Experiment • Result: Loops • Conventional inlining • Having loss • Annotation-based inlining • No loss, more gain

Experiment • Result: Performance • Average speedup limited • Annotation-based inlining always better

Summary • Inter-procedural parallelization • Summarize effects of conventional inlining • Gain • Loss • Missed • Propose annotation-based inlining • Annotation summary • Enhanced inlining strategy • Reverse inlining

Thanks! Questions?
