This thesis seminar presents the use of supervised learning in optimizing compilers through the LOCO methodology, which automates heuristic induction for efficient optimization decisions on complex computer architectures. The study covers two applications, hybrid register allocation and instruction scheduling filters, and reports significant reductions in optimization cost while retaining most of the performance benefit. The methodology generates training data, induces heuristics, and integrates them into the compiler.
Learning for Optimizing Compilers • John Cavazos • Thesis Seminar • Architecture and Language Implementation Lab • University of Massachusetts, Amherst
Motivation • Compiler writers have a difficult task • Optimizations are NP-hard • Computer architectures are complex • Computer architects need rapid evaluation • Generating heuristics manually is slow, complicated, and ad hoc
Propose Supervised Learning • Induces heuristics automatically • Training examples • a, b, c, …, z, label • a, b, c, …, z: properties of problem • label: proper decision to make • Two objectives: • Minimize error • Prefer less complicated function • LOCO (Learning for Optimizing COmpilers)
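The two objectives on this slide can be illustrated with a small sketch. It is not the learner used in the thesis: the `Example` and `Candidate` types, the `complexity` measure, and the `penalty` weight are all assumptions made for illustration. It only shows one way to trade training error against the complexity of a candidate heuristic.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Example:
    features: tuple  # a, b, c, ..., z: properties of the problem
    label: str       # the proper decision to make


@dataclass(frozen=True)
class Candidate:
    name: str
    predict: Callable[[tuple], str]
    complexity: int  # hypothetical size measure, e.g. number of rule conditions


def error_rate(candidate: Candidate, examples: Sequence[Example]) -> float:
    """Fraction of training examples the candidate labels incorrectly."""
    wrong = sum(1 for ex in examples if candidate.predict(ex.features) != ex.label)
    return wrong / len(examples)


def select(candidates: Sequence[Candidate], examples: Sequence[Example],
           penalty: float = 0.01) -> Candidate:
    """Pick the candidate with the lowest error plus complexity penalty.

    The penalty weight is an assumption; it breaks ties in favor of the
    less complicated function.
    """
    return min(candidates,
               key=lambda c: error_rate(c, examples) + penalty * c.complexity)
```

With two candidates that make identical predictions, `select` returns the one with the smaller `complexity`, which is the "prefer less complicated function" objective in miniature.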
Benefits of Supervised Learning • Heuristic construction sped up • Determines relative importance of features • Effective heuristics • Comparable to hand-tuned heuristics • Theoretically sound • Traditional approach ad hoc
Taxonomy of Compiler Heuristics • What order to apply optimizations: phase-ordering heuristics • When to optimize: filters • Which optimization algorithm to apply: hybrid optimizations • How to optimize: priority functions
The LOCO Methodology • Determine class of heuristic • Generate raw data • Instrument compiler • Process raw data • Thresholds • Generates training data • Induce heuristic • Integrate into compiler
The LOCO Methodology [diagram] Instrumented Compiler → generate raw learning data → process raw data (thresholding) → Training Set → Supervised Learning (rule induction) → induces heuristic → Production Compiler
Experimental Setup • Java JIT compiler • Jikes RVM 2.0.2 • PowerPC 533 MHz G4, model 7410 • Case Study 1: SPEC JVM benchmarks • Case Study 2: Scientific benchmarks • Scheduling improves by 4% or more
Case Study 1: Hybrid Register Allocation
Motivation • Register Allocation: important • Effective use of registers • Different Algorithms to choose from • Graph coloring: possibly expensive • Linear scan: not always effective • Which algorithm to apply?
Solution • Features predict which algorithm to use • Heuristic function controls allocator • Reduces cost significantly • Retains most benefit • Successful with simple features • Applicable to other optimizations
Inducing Heuristic Controller • For each method generate raw training data • Features of method • Additional spills incurred • Cost of allocation algorithms • Process raw data to generate training set • Leave-one-out cross-validation • Output of LOCO = heuristic controller
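Leave-one-out cross-validation can be sketched as follows. The slide does not say what unit is held out; this sketch assumes it is a whole benchmark, so the heuristic is always evaluated on a program it was not trained on. The `majority_learner` is a trivial stand-in for the real learner, and the data layout (a dict from benchmark name to `(features, label)` pairs) is likewise an assumption.

```python
from collections import Counter


def majority_learner(training):
    """Stand-in learner (assumption): always predict the most common label."""
    most_common = Counter(label for _, label in training).most_common(1)[0][0]
    return lambda features: most_common


def leave_one_out(data_by_benchmark, learner):
    """Return the held-out accuracy for each benchmark.

    data_by_benchmark maps a benchmark name to a list of (features, label)
    pairs. For each benchmark, the heuristic is induced from all the other
    benchmarks and then evaluated on the one left out.
    """
    results = {}
    for held_out, test_set in data_by_benchmark.items():
        training = [inst
                    for name, insts in data_by_benchmark.items()
                    if name != held_out
                    for inst in insts]
        heuristic = learner(training)
        correct = sum(1 for features, label in test_set
                      if heuristic(features) == label)
        results[held_out] = correct / len(test_set)
    return results
```

Any function that takes a training list and returns a predictor can be passed as `learner`, so a rule inducer could replace the stand-in without changing the loop.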
Labeling Training Instances • Two factors: • Cost of register allocation • Spill benefit of different allocators • Prefer graph coloring • If benefit above threshold • Prefer linear scan • If graph coloring cost above threshold • No spill benefit
Motivation for Threshold Technique • Noise reduction technique • Simplifies learning • Removes cases of fine distinction • Separation by a threshold gap • For example: • T=10% model estimates improvement by 10%
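The threshold gap can be illustrated with a short sketch. It assumes each raw instance carries a measured fractional improvement, that instances at or above the threshold T are labeled as beneficial, that instances with no improvement are labeled as not beneficial, and that everything in between is dropped. The label names and the exact comparison operators are assumptions, not taken from the slides.

```python
def threshold_label(improvement, t=0.10):
    """Label one raw instance, or return None if it falls inside the gap."""
    if improvement >= t:
        return "beneficial"
    if improvement <= 0.0:
        return "not beneficial"
    return None  # fine distinction: leave it out of the training set


def build_training_set(raw, t=0.10):
    """Turn (features, improvement) pairs into (features, label) pairs."""
    training = []
    for features, improvement in raw:
        label = threshold_label(improvement, t)
        if label is not None:
            training.append((features, label))
    return training
```

Raising `t` widens the gap: fewer instances survive, but the two remaining classes are further apart, which is the noise-reduction effect the slide describes.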
Thresholding [figure: training instances labeled Linear Scan, Graph Coloring, or No Instance; Spill Threshold (8192), Cost Threshold (0.5)]
Labeling Training Instances
If (LS_Spill - GC_Spill > Spill_Threshold) Print "GC"; // High spill benefit
Else if (LS_Cost / GC_Cost > Cost_Threshold) Print "LS"; // High cost
Else if (LS_Spill - GC_Spill <= 0) Print "LS"; // No spill benefit
Else { /* No label */ } // Skip training instance
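A runnable version of the labeling pseudocode, as a minimal sketch. The comparisons are copied from the slide as written; the function name, the parameter names, and the use of `None` for a skipped instance are assumptions. The default thresholds are the values shown on the thresholding slide, and `gc_cost` is assumed to be positive.

```python
from typing import Optional

SPILL_THRESHOLD = 8192  # value shown on the thresholding slide
COST_THRESHOLD = 0.5    # value shown on the thresholding slide


def label_instance(ls_spill: float, gc_spill: float,
                   ls_cost: float, gc_cost: float,
                   spill_threshold: float = SPILL_THRESHOLD,
                   cost_threshold: float = COST_THRESHOLD) -> Optional[str]:
    """Return "GC", "LS", or None (skip this training instance)."""
    spill_benefit = ls_spill - gc_spill
    if spill_benefit > spill_threshold:
        return "GC"  # high spill benefit
    if ls_cost / gc_cost > cost_threshold:
        return "LS"  # high cost
    if spill_benefit <= 0:
        return "LS"  # no spill benefit
    return None      # no label: instance is left out of the training set
```

Returning `None` rather than a third label keeps the unlabeled cases out of the training set, which matches the "Skip Training Instance" branch.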
Hybrid Register Allocation is Successful • Significantly reduces register allocation time • Reduced allocation time by 60% • Preserves benefit of graph coloring • Achieved 93% of graph coloring benefit • LOCO effective for this heuristic
Case Study 2: Instruction Scheduling Filters
Motivation • Instruction scheduling: important • Improvements over 15% • But: • Expensive • Frequently not beneficial • Problem: Can we predict which blocks benefit from scheduling?
Solution • Features of block predict when to schedule • Heuristic controls scheduling • Reduces cost of scheduling • Retains benefit of scheduling • Successful with simple features • Filter for applying scheduler
Inducing a Filter • Construct cheap-to-compute features of a block • Obtain training instances that include: • Features of the block • Labels (scheduling benefit to block) • Induce a filter using LOCO • We used rule induction • Use the filter to control when compiler schedules
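How an induced filter might sit in front of the scheduler can be sketched as below. Everything here is hypothetical: the three features, the rules inside `induced_filter` (hand-written stand-ins, not rules induced in the thesis), and the `compile_block` entry point. The point is only the control flow: compute cheap features, ask the filter, and run the scheduler only when the filter says yes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockFeatures:
    """Hypothetical cheap-to-compute features of a basic block."""
    num_instructions: int
    num_loads: int
    num_float_ops: int


def induced_filter(f: BlockFeatures) -> bool:
    """Stand-in for a rule set produced by rule induction (invented rules)."""
    if f.num_instructions >= 8 and f.num_loads >= 2:
        return True
    if f.num_float_ops >= 3:
        return True
    return False  # default rule: do not schedule


def compile_block(instructions, features, scheduler,
                  should_schedule=induced_filter):
    """Run the scheduler only on blocks the filter selects."""
    if should_schedule(features):
        return scheduler(instructions)
    return list(instructions)  # leave the original order untouched
```

Because the filter is an ordinary predicate, a newly induced rule set can be swapped in by passing a different `should_schedule` function.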
Block Timing Estimator • Estimate of cycles to execute block • Simple model of real machine • Determines cost of block in isolation • Relative cycle differences important • Not absolute cycle counts
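A block timing estimator of this kind can be sketched with a very simple machine model. The model here is an assumption, not the one used in the thesis: a single-issue, in-order machine where each instruction issues once its source registers are ready, with invented latencies in `LATENCY`. As the slide notes, only the relative difference between two orderings of a block matters, not the absolute count.

```python
# Hypothetical opcode latencies in cycles (not real PowerPC figures).
LATENCY = {"load": 3, "fmul": 4, "add": 1}


def estimate_cycles(block):
    """Estimate cycles to execute a block in isolation.

    block is a list of (opcode, dest_register, source_registers) tuples.
    """
    ready = {}   # register -> cycle at which its value becomes available
    issue = 0    # earliest cycle at which the next instruction may issue
    finish = 0   # cycle at which the last result so far completes
    for opcode, dest, sources in block:
        start = max([issue] + [ready.get(s, 0) for s in sources])
        done = start + LATENCY[opcode]
        ready[dest] = done
        issue = start + 1  # single issue: one instruction per cycle
        finish = max(finish, done)
    return finish


def relative_benefit(unscheduled, scheduled):
    """Fraction of estimated cycles saved by the scheduled order."""
    before = estimate_cycles(unscheduled)
    if before == 0:
        return 0.0
    return (before - estimate_cycles(scheduled)) / before


unscheduled = [("load", "r1", ()), ("add", "r2", ("r1",)),
               ("load", "r3", ()), ("add", "r4", ("r3",))]
scheduled = [("load", "r1", ()), ("load", "r3", ()),
             ("add", "r2", ("r1",)), ("add", "r4", ("r3",))]
print(relative_benefit(unscheduled, scheduled))  # prints 0.375
```

In this example the unscheduled order is estimated at 8 cycles and the scheduled order at 5, so the relative benefit is 3/8. A value like this is what a label for a training instance could be derived from.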
Filters are Successful • Significantly reduces scheduling time • Reduced scheduling time by 75% • Preserves benefit of scheduling • Achieved 93% of scheduling benefit • LOCO effective for this heuristic
Related Work • Supervised learning • Loop unrolling and tiling • Genetic algorithms • Hyperblocks, register allocation, prefetching (MIT) • Application-specific compilation strategy (Rice) • Reinforcement learning • Used to induce heuristic for scheduling (UMass) • We argue LOCO is better
Future Work • More work on filters • Inlining and SSA-based optimizations • More work on hybrid optimizations • Garbage collection • More work on priority functions • Register allocation spill heuristic • Use LOCO anywhere a heuristic is used
Conclusion • LOCO effective at constructing heuristics • Faster than most alternatives • LOCO can lead to insights • More readable than other alternatives • LOCO heuristics competitive • Comparable to hand-tuned heuristics • LOCO easier to use