Efficient Program Compilation through Machine Learning Techniques

Presentation Transcript


  1. Efficient Program Compilation through Machine Learning Techniques. Gennady Pekhimenko (IBM Canada), Angela Demke Brown (University of Toronto)

  2. Motivation. My cool program is compiled with the compiler at -O2: it applies a short pipeline of transformations (Unroll, Inline, Peephole, DCE, ...) and produces an executable in a few seconds. But what to do if the executable is slow? Replace -O2 with -O5: the compiler now applies many more transformations (up to Optimization 100) and, in 1-10 minutes, produces a new, fast executable.

  3. Motivation (2). Compiling our cool operating system at -O2 takes 1 hour, but the resulting executable is too slow. Recompiling at -O5 gives a new executable, but compilation now takes 20 hours, and we do not have that much time. Why did it happen?

  4. Basic Idea. Do we need all of these optimizations (Unroll, ..., Optimization 100) for every function? Probably not. Compiler writers can typically solve this problem, but how? • Build a description of every function • Classify functions based on that description • Apply only certain optimizations to every class. Machine learning is well suited to this kind of problem.
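As a rough Python illustration of the per-class idea, assuming made-up class names and transformation sets (in the real system both come from the trained classifier described later):

```python
# Illustrative sketch only: the class labels and per-class transformation sets
# are invented; in the actual system they are produced by the learned classifier.
OPTIMIZATIONS_BY_CLASS = {
    "small_straightline": ["Peephole", "DCE"],
    "hot_loop_nest": ["Unroll", "Inline", "Peephole", "DCE"],
    "large_branchy": ["Inline", "DCE"],
}

def optimizations_for(function_class):
    """Apply only the transformations selected for this function's class."""
    return OPTIMIZATIONS_BY_CLASS.get(function_class, ["DCE"])

print(optimizations_for("hot_loop_nest"))  # ['Unroll', 'Inline', 'Peephole', 'DCE']
```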

  5. Overview • Motivation • System Overview • Experiments and Results • Related Work • Conclusions • Future Work

  6. Initial Experiment 3X difference on average

  7. Initial Experiment (2)

  8. Our System
  Offline:
  • Gather training data: prepare (extract features, modify heuristic values, choose transformations, find hot methods), compile, measure run time, and record the best feature settings
  • Learn: train a logistic regression classifier, producing the classification parameters
  Online:
  • Deploy: the TPO/XL compiler uses the classifier to set heuristic values

  9. Data Preparation
  The existing XL compiler was missing the needed functionality, so an extension was made to the existing Heuristic Context approach. Three key elements:
  • Feature extraction: total # of instructions, loop nest level, # and % of loads, stores, and branches, loop characteristics, float and integer # and %
  • Heuristic values modification
  • Target set of transformations: Unroll, Wandwaving, If-conversion, Unswitching, CSE, Index Splitting, ...
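As a sketch, the per-function description can be thought of as a fixed-length feature vector; the field names below follow the feature list on this slide but are otherwise hypothetical, not the compiler's internal representation:

```python
# Hypothetical feature-vector layout mirroring the slide's feature list.
from dataclasses import dataclass, astuple

@dataclass
class FunctionFeatures:
    total_insts: int        # total number of instructions
    loop_nest_level: int    # deepest loop nesting level
    num_loads: int          # loads (count and fraction of all instructions)
    pct_loads: float
    num_stores: int
    pct_stores: float
    num_branches: int
    pct_branches: float
    num_float_ops: int      # floating-point operations (count and fraction)
    pct_float_ops: float
    num_int_ops: int        # integer operations (count and fraction)
    pct_int_ops: float

    def as_vector(self):
        """Flatten to the numeric vector handed to the classifier."""
        return list(astuple(self))
```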

  10. Gather Training Data
  • Try to "cut" transformations backwards (from last to first), e.g. Wandwaving, Late Inlining, Unroll
  • If the run time is not worse than before, the transformation can be skipped
  • Otherwise we keep it
  • We do this for every hot function of every test
  The main benefit is linear complexity.
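A minimal sketch of this backward "cutting" search, assuming a caller-supplied compile_and_measure(transformations) callback that rebuilds the function with exactly the given transformations and returns its run time (the callback and the tolerance factor are illustrative, not part of the TPO/XL infrastructure):

```python
def prune_transformations(transformations, compile_and_measure, tolerance=1.01):
    """Try to drop transformations from last to first; a transformation is
    skipped whenever removing it does not make the measured run time worse
    (within `tolerance`). One recompile per transformation: linear complexity."""
    kept = list(transformations)
    baseline = compile_and_measure(kept)
    for i in range(len(kept) - 1, -1, -1):      # from last to first
        candidate = kept[:i] + kept[i + 1:]
        runtime = compile_and_measure(candidate)
        if runtime <= baseline * tolerance:     # not worse: cut this transformation
            kept, baseline = candidate, runtime
        # otherwise keep it and move on to the previous transformation
    return kept

# Toy check: pretend only Unroll matters for this function.
print(prune_transformations(["Wandwaving", "Late Inlining", "Unroll"],
                            lambda ts: 1.0 if "Unroll" in ts else 2.0))  # ['Unroll']
```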

  11. Learn with Logistic Regression
  Input: the function descriptions and the best heuristic values found during training.
  Candidate learning techniques: • Logistic Regression • Neural Networks • Genetic Programming
  Output: a classifier, shipped to the compiler as .hpredict files alongside the heuristic values.
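A minimal training sketch with scikit-learn's LogisticRegression on toy data; the feature rows, class labels, and the joblib file below are stand-ins for the real training set and the compiler's .hpredict files:

```python
import numpy as np
import joblib
from sklearn.linear_model import LogisticRegression

# Rows: per-function feature vectors (e.g. FunctionFeatures.as_vector()).
# Labels: index of the best transformation set found for that function.
X = np.array([[120, 1,  30, 0.25, 10, 0.08, 15, 0.125,   0, 0.0,   60, 0.50],
              [800, 3, 200, 0.25, 90, 0.11, 60, 0.075, 300, 0.375, 100, 0.125],
              [ 50, 0,   5, 0.10,  5, 0.10, 10, 0.200,   0, 0.0,   25, 0.50]])
y = np.array([1, 2, 0])

clf = LogisticRegression(max_iter=1000).fit(X, y)
joblib.dump(clf, "hpredict_model.joblib")  # hypothetical stand-in for a .hpredict file
print(clf.predict(X[:1]))                  # predicted class for the first function
```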

  12. Deployment. Online phase, for every function: • Calculate the feature vector • Compute the prediction • Use this prediction as the heuristic context. The overhead is negligible.
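A sketch of the online step, reusing the hypothetical model file saved in the previous sketch; the mapping from the predicted class to the heuristic context is left abstract:

```python
import numpy as np
import joblib

clf = joblib.load("hpredict_model.joblib")  # hypothetical stand-in for a .hpredict file

def heuristic_context_for(feature_vector):
    """Predict the optimization class for one function's feature vector;
    the compiler would use this class as the heuristic context."""
    x = np.asarray(feature_vector, dtype=float).reshape(1, -1)
    return int(clf.predict(x)[0])
```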

  13. Overview • Motivation • System Overview • Experiments and Results • Related Work • Conclusions • Future Work

  14. Experiments. Benchmarks: SPEC2000 plus others from IBM customers. Platform: IBM server with 4x POWER5 at 1.9 GHz and 32 GB RAM, running AIX 5.3.

  15. Results: compilation time. 2x average speedup.

  16. Results: execution time

  17. New benchmarks: compilation time

  18. New benchmarks: execution time. 4% speedup.

  19. Overview • Motivation • System Overview • Experiments and Results • Related Work • Conclusions • Future Work

  20. Related Work
  • Iterative compilation: Pan and Eigenmann; Agakov et al.
  • Single heuristic tuning: Calder et al.; Stephenson et al.
  • Multiple heuristic tuning: Cavazos et al.; MILEPOST GCC

  21. Conclusions and Future Work
  • 2x average compile time decrease
  • Future work: execution time improvement, the -O5 level, performance counters for better method descriptions
  • Other benefits: the Heuristic Context infrastructure, bug finding

  22. Thank you • Raul Silvera, Arie Tal, Greg Steffan, Mathew Zaleski • Questions?
