IBM Haifa's performance tools include FDPR-Pro, CodeAnalyzer, BProber, and ESTO. FDPR-Pro is a feedback-directed optimizer that operates on linked binary executable files. It is part of AIX 5L, available on Linux on Power, and under development for Mac OS X and z/OS. It uses a global view of the whole program to apply optimizations such as code reordering and static data rearrangement. CodeAnalyzer, an Eclipse plugin, performs static analysis of executable files to point out potential performance bottlenecks for hardware architects, compiler writers, and application developers.
Performance Tools developed in IBM Haifa
http://www.haifa.il.ibm.com/dept/svt/code_paot.html
Gad Haber (haber@il.ibm.com)
HRL Performance Tools
• FDPR-Pro
  • Feedback-based optimizer operating on binary executable files
  • Part of AIX 5L
  • Available on Linux on Power via alphaWorks
  • Under development for
    • Mac OS X (to be available soon via alphaWorks)
    • z/OS
• CodeAnalyzer
  • Eclipse plugin tool for analyzing executable files
  • Under development
  • To be added as part of the Performance Workbench (PerfWB)
• BProber
  • Utility for instrumenting binary executable files
  • Under development
• ESTO
  • Utility for identifying the optimal set of optimization options
  • Under development
FDPR-Pro Feedback Directed Program Restructuring
FDPR-Pro - Feedback Directed Program Restructuring
• Uses a global view of the entire program
• Operates on the executable file after linkage
• These properties enable FDPR-Pro to perform:
  • Global code reordering
  • Optimizations across procedure boundaries
  • Static data rearrangement
  • Constant area rearrangement
  • Data prefetching
• Examples of additional FDPR-Pro optimizations:
  • Usage of branch tables
  • Usage of TOC load instructions
  • And more
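To make global code reordering concrete, here is a minimal sketch of one common profile-guided layout technique: greedily chaining basic blocks along their hottest control-flow edges so that frequent paths become fall-throughs. The slides do not describe FDPR-Pro's actual algorithm, so this is a generic illustration only; the function name `reorder_blocks`, the block names, and the edge counts are all hypothetical.

```python
def reorder_blocks(blocks, edge_counts, entry):
    """Greedy chain layout (illustrative only, not FDPR-Pro's algorithm).

    blocks      : list of basic-block names in original order
    edge_counts : {(src, dst): execution count of the edge src -> dst}
    entry       : name of the entry block, which must stay first
    """
    # Every block starts in its own single-element chain.
    chain_of = {b: [b] for b in blocks}

    # Visit edges from hottest to coldest and glue chains together when the
    # edge runs from the tail of one chain to the head of another.
    for (src, dst), _count in sorted(edge_counts.items(), key=lambda kv: -kv[1]):
        a, b = chain_of[src], chain_of[dst]
        if dst != entry and a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a

    # Emit the entry chain first, then the remaining chains in original order.
    seen, layout = set(), []
    for blk in [entry] + list(blocks):
        chain = chain_of[blk]
        if id(chain) not in seen:
            seen.add(id(chain))
            layout.extend(chain)
    return layout


# Hypothetical profile: the error path is rarely taken.
blocks = ["entry", "check", "error", "work", "exit"]
edge_counts = {
    ("entry", "check"): 100,
    ("check", "error"): 1,
    ("check", "work"): 99,
    ("work", "exit"): 99,
    ("error", "exit"): 1,
}
print(reorder_blocks(blocks, edge_counts, "entry"))
# prints ['entry', 'check', 'work', 'exit', 'error']
```

The cold `error` block ends up after the hot path, so the frequently executed blocks sit contiguously in the layout.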
Method
• Phase 1: Code instrumentation
  • Basic block level
• Phase 2: Profile information gathering
  • Selection of the "right" input set (representative workload)
  • Accumulation over several input sets
• Phase 3: Global code and data optimizations
  • Complements the compiler
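The profile-gathering phase can be pictured as summing basic-block execution counts over several runs. The sketch below assumes each run yields a simple mapping from block name to count; that representation, the helper names `accumulate_profiles` and `hot_blocks`, and the sample numbers are assumptions for illustration, not FDPR-Pro's profile format.

```python
from collections import Counter


def accumulate_profiles(runs):
    """Sum per-basic-block execution counts across several profiling runs."""
    total = Counter()
    for run in runs:
        total.update(run)  # Counter.update adds counts instead of replacing them
    return total


def hot_blocks(profile, threshold):
    """Return blocks whose accumulated count reaches the threshold, hottest first."""
    return [block for block, count in profile.most_common() if count >= threshold]


# Two hypothetical input sets exercising the same instrumented program.
runs = [
    {"bb_entry": 1, "bb_loop": 1000, "bb_error": 0},
    {"bb_entry": 1, "bb_loop": 4000, "bb_error": 2},
]
profile = accumulate_profiles(runs)
print(hot_blocks(profile, 100))
# prints ['bb_loop']
```

Accumulating over several input sets reduces the risk that a single unrepresentative workload decides which code is treated as hot.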
FDPR-Pro Optimization Options
• -RC: Reorder code
• -bf: Branch folding
• -bp: Branch prediction bit setting
• -align: Code alignment
• -nop: Eliminate nop instructions
• -uce: Unreachable code elimination
• -hco_resched: Hot/cold instruction scheduling
• -RD, -build_dcg: Static data reordering
• -tocload, -reduce_toc: TOC load optimizations
• -si, -ipht, -ihf, -isf: Aggressive function inlining options
• -ptrgl_optimization: Optimize function calls via pointers
• -dcbt_optimization: Inject data prefetching instructions
• -link_reg_optimization: Eliminate stores/restores of the link register
• -volatile_regs: Eliminate stores/restores using available volatile registers
• -killed_regs: Eliminate stores/restores of killed registers
• -load_after_store: Separate frequent loads and stores to the same address
• -loop_unroll: Loop unrolling
• -stack_opt: Reduce the stack frame size of hot functions
• -dce: Dead code elimination
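As an illustration of what an option such as unreachable code elimination has to compute, the sketch below finds basic blocks that cannot be reached from the entry point of a control flow graph. It is a textbook reachability walk, not FDPR-Pro's implementation; the function name `unreachable_blocks` and the sample graph are hypothetical.

```python
def unreachable_blocks(cfg, entry):
    """Return blocks in cfg that no path from entry can reach.

    cfg   : {block: list of successor blocks}
    entry : name of the entry block
    """
    reachable, stack = set(), [entry]
    while stack:
        block = stack.pop()
        if block in reachable:
            continue
        reachable.add(block)
        stack.extend(cfg.get(block, ()))  # follow every outgoing edge
    return sorted(set(cfg) - reachable)


# Hypothetical graph: nothing branches to "dead".
cfg = {
    "entry": ["a"],
    "a": ["exit"],
    "dead": ["exit"],
    "exit": [],
}
print(unreachable_blocks(cfg, "entry"))
# prints ['dead']
```

On a real binary the hard part is building a trustworthy graph in the first place, since indirect branches and calls through pointers can hide edges.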
CodeAnalyzer - Motivation
• Architectures are becoming more complex
• Using only hardware simulators to detect potential performance bottlenecks in a given program is hard
• There is a need for performance tools that can statically analyze and visualize programs for a platform design, to be used by:
  • Hardware architects
  • Compiler writers
  • Application developers
CodeAnalyzer
• CodeAnalyzer is an Eclipse plugin that performs comprehensive static analysis on given executable files and DLLs
• Relies on the FDPR-Pro tool for the analysis phase
• CodeAnalyzer displays the analyzed information together with profiling data collected by:
  • tprof
  • FDPR-Pro
• The code is then colored according to:
  • Frequency counters, gathered by FDPR-Pro
  • Hardware event ticks, gathered by tprof
CodeAnalyzer (continued)
• Provides several views of the input binary
  • Assembly instructions
  • Basic blocks
  • Procedures
  • CSECT modules
  • Control flow graph
  • Hot loops
  • Call graph
  • Annotated source code
  • Dispatch group formation
  • Pipeline slots and functional units
CodeAnalyzer: Performance Comments
• Performance comments displayed by CodeAnalyzer
  • Comments which do not require profiling
    • Pipeline stalls for the Power architecture
    • Unreachable code and unused data
  • Profile-based comments
    • Loop-invariant instructions within hot loops
    • Hot function calls proceeded by overwriting non-volatile registers
    • Hot saves and restores of registers which could be relocated to cold spill areas
    • Hot instructions that could be scheduled to colder areas in the code
    • Removable hot branches
      • Hot direct unconditional branches
      • Hot direct conditional branches that are taken, which have a colder fallthru
    • Hot call sites that are appropriate candidates for function inlining
    • Hot call sites that are appropriate for function specialization
    • Hot loops that are appropriate for loop unrolling
    • Hot TOC load instructions that can be replaced by immediate add instructions
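One of the profile-based comments, hot conditional branches that are taken while their fall-through is colder, lends itself to a small sketch. The record type `CondBranch`, its fields, the function `hot_taken_branches`, and the threshold are assumptions made for illustration; the slides do not say how CodeAnalyzer represents branches or decides what counts as hot.

```python
from dataclasses import dataclass


@dataclass
class CondBranch:
    """Hypothetical per-branch profile record."""
    address: int
    taken_count: int      # times the branch jumped to its target
    fallthru_count: int   # times execution continued to the next instruction


def hot_taken_branches(branches, hot_threshold):
    """Flag branches that are hot on the taken side and colder on the fall-through side."""
    return [
        b for b in branches
        if b.taken_count >= hot_threshold and b.taken_count > b.fallthru_count
    ]


# Hypothetical profile data for three conditional branches.
branches = [
    CondBranch(address=0x1000, taken_count=9000, fallthru_count=10),
    CondBranch(address=0x1040, taken_count=5, fallthru_count=8000),
    CondBranch(address=0x1080, taken_count=20, fallthru_count=1),
]
for b in hot_taken_branches(branches, hot_threshold=1000):
    print(hex(b.address))
# prints 0x1000
```

A branch flagged this way is a candidate for inverting the condition and swapping the two successors in the layout, so that the hot path becomes the fall-through.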
PerfWB
• CodeAnalyzer is part of the Performance Workbench (PerfWB) utility
• PerfWB is a collection of Eclipse plugins that provide performance monitoring, tuning, and analysis
• PerfWB consists of the following Eclipse plugins:
  • ProcMon: system-level monitoring tool for displaying system state and for monitoring running processes and threads
  • E-Tune: visualizer of feedback information produced by tprof
  • CodeAnalyzer: performance analyzer of executables and DLLs