Correcting the Dynamic Call Graph Using Control Flow Constraints

Byeongcheol (BK) Lee, Kevin Resnick, Michael Bond, Kathryn McKinley (UT Austin)


Presentation Transcript


  1. Correcting the Dynamic Call Graph Using Control Flow Constraints
     Byeongcheol (BK) Lee, Kevin Resnick, Michael Bond, Kathryn McKinley
     UT Austin

  2. Motivation
     [Figure: dynamic call graph in which method a calls b and c, with edge weights w1 and w2]
     • Complexity of large object-oriented programs
       ◦ Decompose the program into small methods
       ◦ The method boundary becomes a performance bottleneck
     • Dynamic interprocedural optimization
       ◦ Solves the method boundary problem
       ◦ Inlining and specialization vary performance by a factor of 2
     • The dynamic call graph (DCG) is a critical input!

  3. Inaccurate call graph
     [Figure: method a contains call b and call c; DCGSample records a→b = 1,000 and a→c = 500, flagged as an error]

  4. Timer-based sampling and timing bias
     [Figure: call stack over time t; method a repeatedly calls b and c]

  8. Timer-based sampling and timing bias
     [Figure: call stack over time t with four timer ticks; each tick increments the DCGSample count of the edge (a→b or a→c) that is on the stack at that moment]
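
The timing bias on this slide can be reproduced with a small simulation. This is only a sketch under assumed numbers, not the profiler used in the talk: the function `simulate`, the schedule (b runs for 4 time units, c is called twice for 1 unit each), and the tick interval of 7 are all hypothetical choices made for illustration.

```python
def simulate(ticks, tick_interval, schedule):
    """Simulate timer-based call graph sampling for a caller that loops forever.

    schedule: list of (callee, duration) pairs that method a executes in order
    on every loop iteration (hypothetical workload, not from the talk).
    Returns (true_calls, samples), both dicts keyed by callee name.
    """
    period = sum(duration for _, duration in schedule)
    samples = {callee: 0 for callee, _ in schedule}
    for k in range(ticks):
        # Position of this timer tick inside the current loop iteration.
        offset = (k * tick_interval) % period
        for callee, duration in schedule:
            if offset < duration:
                samples[callee] += 1  # this callee is on the stack at the tick
                break
            offset -= duration
    # Ground truth: every completed loop iteration makes each scheduled call once.
    iterations = (ticks * tick_interval) // period
    true_calls = {callee: 0 for callee, _ in schedule}
    for callee, _ in schedule:
        true_calls[callee] += iterations
    return true_calls, samples


# b is slow (4 time units); c is fast (1 unit) but is called twice per iteration.
true_calls, samples = simulate(
    ticks=600, tick_interval=7, schedule=[("b", 4), ("c", 1), ("c", 1)]
)
```

In this setup c is called twice as often as b, yet b collects twice as many samples because it occupies more of the timeline. That is the reversal the timing-bias slides describe.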

  9. Overhead and accuracy in call graph profiling
     [Figure: overhead (%) plotted against accuracy (%) for full instrumentation, timer-based sampling [2000], Arnold-Grove sampling [2005], and correction [2007]]

  10. Outline • Motivation • Call graph correction • Evaluation

  11. Timing bias in SPEC JVM98 raytrace
      [Figure: normalized frequency (%) of method calls grouped by source method; series shown: sampling]

  12. Timing bias in SPEC JVM98 raytrace
      [Figure: normalized frequency (%) of method calls grouped by source method]

  13. Correction algorithms
      • Detect and correct DCG error
        ◦ DCG constraint
        ◦ Static and dynamic approaches
      • Static FDOM (frequency dominator) correction
        ◦ Static approach
        ◦ Uses static FDOM constraint on DCG
      • Dynamic basic block profile correction
        ◦ Dynamic approach
        ◦ Uses dynamic basic block profile constraint on DCG

  14. Static FDOM constraint
      [Figure: CFG of method a containing call b and call c, next to the DCG edges a→b and a→c]
      • FDOM constraint on CFG
        ◦ call c is executed at least as many times as call b
        ◦ call c FDOM call b
      • FDOM constraint on DCG
        ◦ f(a→c) ≥ f(a→b)

  15. Static FDOM correction
      FDOM constraint: f(a→c) ≥ f(a→b)
      DCGSample:          f(a→b) = 1,000   f(a→c) = 500
      DCGFDOMCorrection:  f(a→b) = 750     f(a→c) = 750
      • Detect the error and assign the same average frequency to both edges
      • One possible solution to the FDOM constraint
      • Preserves the total frequency sum
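
The averaging step on this slide can be written down in a few lines. This is a minimal sketch, not the Jikes RVM implementation: the function name `fdom_correct` and the representation of edges as `(caller, callee)` tuples are assumptions, and the single pass over the constraints is a simplification that may not settle every case when constraints interact.

```python
def fdom_correct(freq, constraints):
    """Apply averaging correction for violated FDOM constraints.

    freq: dict mapping a DCG edge (caller, callee) to its sampled frequency.
    constraints: list of (hi, lo) edge pairs meaning f(hi) >= f(lo) must hold.
    Returns a corrected copy; the input dict is left untouched.
    """
    corrected = dict(freq)
    for hi, lo in constraints:
        if corrected[hi] < corrected[lo]:
            # Violation detected: give both edges the same average frequency.
            # This satisfies f(hi) >= f(lo) and keeps the pair's sum unchanged.
            average = (corrected[hi] + corrected[lo]) / 2
            corrected[hi] = average
            corrected[lo] = average
    return corrected


# Numbers from the slide: the sample says a->b = 1,000 and a->c = 500,
# but the CFG guarantees f(a->c) >= f(a->b).
sampled = {("a", "b"): 1000, ("a", "c"): 500}
fixed = fdom_correct(sampled, [(("a", "c"), ("a", "b"))])
```

Edges that already satisfy their constraint are left as sampled, so the correction only touches frequencies that are provably wrong.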

  16. Dynamic basic block profile constraint
      [Figure: CFG of method a containing call b and call c, with branch edges labeled 50% and 50%]
      • Some dynamic optimization systems do edge profiling
        ◦ Baseline compiler in Jikes RVM
      • Dynamic basic block profile constraint on CFG
        ◦ f(call c) = 2 * f(call b)
      • Dynamic basic block profile constraint on DCG
        ◦ f(a→c) = 2 * f(a→b)

  17. Dynamic basic block profile correction
      Constraint: f(a→c) = 2 * f(a→b)
      DCGSample:                 f(a→b) = 1,000   f(a→c) = 500
      DCGEdgeProfileCorrection:  f(a→b) = 500     f(a→c) = 1,000
      fNew(a→b) = 1/(1+2) * (1,000 + 500) = 500
      fNew(a→c) = 2/(1+2) * (1,000 + 500) = 1,000
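
The two fNew formulas on this slide redistribute the sampled total in proportion to the basic block profile. A minimal sketch of that arithmetic follows; the function name `profile_correct` and the `relative_weight` mapping are hypothetical names introduced for illustration, not part of Jikes RVM.

```python
def profile_correct(freq, relative_weight):
    """Redistribute sampled frequencies according to basic block profile ratios.

    freq: dict mapping a DCG edge (caller, callee) to its sampled frequency.
    relative_weight: dict mapping the edges of one caller to the relative
        execution counts of their call sites, taken from the block profile.
    Returns a corrected copy in which the constrained edges keep their total
    frequency but follow the profiled ratios.
    """
    total = sum(freq[edge] for edge in relative_weight)
    weight_sum = sum(relative_weight.values())
    corrected = dict(freq)
    for edge, weight in relative_weight.items():
        corrected[edge] = total * weight / weight_sum
    return corrected


# Numbers from the slide: the block profile says call c runs twice as often as call b.
sampled = {("a", "b"): 1000, ("a", "c"): 500}
fixed = profile_correct(sampled, {("a", "b"): 1, ("a", "c"): 2})
```

As with the FDOM correction, the sum over the corrected edges equals the sampled sum, so the caller's overall weight in the DCG does not change.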

  18. Best result: raytrace
      [Figure: sampling compared with static FDOM correction and dynamic basic block profile correction]

  19. Outline • Motivation • Call graph correction • Evaluation

  20. Experimental methodology
      • Jikes RVM 2.4.5 on a 3.2 GHz Pentium 4
      • Replay methodology [Blackburn et al. ’06]
        ◦ Deterministic run
        ◦ 1st iteration: compilation + application run
        ◦ 2nd iteration: application run
      • Measurement
        ◦ Accuracy: use overlap accuracy [Arnold & Grove ’05]
        ◦ Overhead: 1st iteration includes call graph correction
        ◦ Performance: 2nd iteration is application-only
      • SPECJVM98 and DaCapo benchmarks
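
For readers unfamiliar with the accuracy metric, overlap is commonly computed by normalizing each call graph so its edge weights sum to 100% and then summing, over all edges, the smaller of the two normalized weights. The sketch below follows that reading; it is an assumption about the metric rather than code from the talk, and the function name `overlap_accuracy` and the example counts are illustrative.

```python
def overlap_accuracy(dcg1, dcg2):
    """Return the overlap (in percent) between two dynamic call graphs.

    Each argument maps a DCG edge (caller, callee) to a frequency or sample
    count. Edges missing from one graph count as weight 0 in that graph.
    """
    total1 = sum(dcg1.values())
    total2 = sum(dcg2.values())
    if total1 == 0 or total2 == 0:
        raise ValueError("both call graphs need a non-zero total weight")
    edges = set(dcg1) | set(dcg2)
    return 100.0 * sum(
        min(dcg1.get(edge, 0) / total1, dcg2.get(edge, 0) / total2)
        for edge in edges
    )


# Illustrative counts: a perfect profile versus a biased sample of the same caller.
perfect = {("a", "b"): 5000, ("a", "c"): 10000}
sampled = {("a", "b"): 1000, ("a", "c"): 500}
accuracy = overlap_accuracy(perfect, sampled)
```

Because both graphs are normalized first, the metric compares the shape of the profiles rather than the raw counts, which is what allows a sampled graph to be compared against a fully instrumented one.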

  21. Accuracy

  22. Overhead

  23. Inlining performance
      [Figure: baseline is profile-guided inlining with default call graph sampling]

  24. Summary
      • CFG constraints improve the DCG
        ◦ Inlining has been tuned for a bad call graph
      • Advantages
        ◦ Can be easily combined with other DCG profiling
        ◦ Minimal overhead, incurred only during compilation
      • Future work
        ◦ More interprocedural optimizations with a high-accuracy DCG

  25. Questions and comments
      • Thank you!

  26. Timing bias misleads optimizer
      DCGPerfect:  a→b = 5,000 times     a→c = 10,000 times
      DCGSample:   a→b = 1,000 samples   a→c = 500 samples   (sampling with timing bias)
      • DCGSample
        ◦ Edge frequencies were reversed!
      • Inlining decision
        ◦ Inliner may inline b instead of c

  27. Call graph profiling in online optimization system
      [Figure: source program (e.g., Java bytecode) enters the online optimization system, which compiles and instruments it into machine code and produces the dynamic call graph]
      • Profiling and program run at the same time
      • Minimize profiling overhead
      • Corollary: sacrifice profiling accuracy
