KTAU (Kernel TAU) presents a flexible and extensible framework for kernel profiling, designed to enhance performance analysis in high-performance computing environments. This innovative profiling infrastructure aims to achieve minimal overhead with lightweight instrumentation techniques, enabling both inclusive and exclusive time measurement. Its modular architecture separates concerns into distinct phases, facilitating easy integration with the TAU profiler. With a focus on context-of-execution profiling and dynamic function identification, KTAU supports both user-space and kernel-space profiling seamlessly.
KTAU: Kernel TAU Aroon Nataraj Suravee Suthikulpanit
Agenda • KTAU Infrastructure • KTAU - TAU Integration • KTAU Source Distribution • Generic Kernel Profiling API (BG/L compute-node)
KTAU Infrastructure • Modular Design: Goal: Separate into 3 phases: Source Instr. -- KTAU -- User Interface • Components: KTAU Kernel Instrumentation, KTAU Profiling Infrastructure, KTAU Proc Interface, KTAU User-API
Motivations • Minimal addition to existing kernel source • Instrumentation-based profiling • Inclusive / Exclusive time • High resolution time (rdtsc) • Flexible and extensible design • Light-weight and Efficient • Configurable
Inst. Point Design • GOAL: Minimal overhead (perturbation) • Address-Symbol mapping (System.map, /proc/kallsyms) • Run-time function profile registration (Dynamic function ID mapping) • Types • Timer (start, stop) • Event Counter, Miscellaneous Counter
Dynamic function ID • GOAL: Fast access to profile data • Each function has a unique ID used to index the profile table • Each function registers itself at run-time • ID ranges: - 0-299 : System Calls - 300-400 : Static Index - 400-1023 : Dynamic Index
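The run-time registration described above can be sketched in plain user-space C. This is a minimal illustration, not the KTAU source: the names prof_table, register_dynamic_id, lookup_or_register and count_call are hypothetical, the table layout is assumed, and because the slide lists 400 in both the static and dynamic ranges, the sketch assumes dynamic IDs start at 400. It also ignores locking, which a kernel implementation would need on SMP.

```c
#include <assert.h>
#include <stddef.h>

/* ID ranges from the slide; treating 400 as the first dynamic ID is an assumption. */
enum {
    KTAU_DYN_FIRST = 400,
    KTAU_DYN_LAST  = 1023,
    KTAU_MAX_IDS   = 1024
};

/* One profile-table slot per function ID (hypothetical layout). */
struct prof_entry {
    const void        *addr;   /* address of the instrumented function */
    unsigned long long count;  /* number of calls recorded */
};

static struct prof_entry prof_table[KTAU_MAX_IDS];
static int next_dyn_id = KTAU_DYN_FIRST;

/* Hand out the next free dynamic ID, or -1 when the range is exhausted. */
static int register_dynamic_id(const void *addr)
{
    if (next_dyn_id > KTAU_DYN_LAST)
        return -1;
    int id = next_dyn_id++;
    prof_table[id].addr = addr;
    return id;
}

/* Instrumentation point: register on first use, then reuse the cached ID. */
static int lookup_or_register(int *cached_id, const void *addr)
{
    if (*cached_id < 0)
        *cached_id = register_dynamic_id(addr);
    return *cached_id;
}

/* Record one call; the ID indexes the table directly, so no search is needed. */
static void count_call(int id)
{
    assert(id >= 0 && id < KTAU_MAX_IDS);
    prof_table[id].count++;
}
```

Each instrumentation point would keep its own cached ID (initialised to -1), so the registration cost is paid once and every later access is a single array index.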
Inst. Placement • Inside kernel routine • Rendezvous entry/exit points • System calls entry/exit (/linux/arch/i386/entry.S) • Bottom-half entry/exit routine (/linux/kernel/softirq.c)
ENTRY(system_call)
        pushl %eax
        SAVE_ALL
        GET_THREAD_INFO(%ebp)
        testb $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT), TI_flags(%ebp)
        jnz syscall_trace_entry
        cmpl $(nr_syscalls), %eax
        jae syscall_badsys
syscall_call:
#ifdef CONFIG_KTAU_SYSCALL
        <START_KTAU_PROF CODE >
#endif /* CONFIG_KTAU_SYSCALL */
        call *sys_call_table(,%eax,4)
#ifdef CONFIG_KTAU_SYSCALL
        <STOP_KTAU_PROF CODE >
#endif /* CONFIG_KTAU_SYSCALL */
        movl %eax,EAX(%esp)
syscall_exit:
        cli
        movl TI_flags(%ebp), %ecx
        testw $_TIF_ALLWORK_MASK, %cx
        jne syscall_exit_work
restore_all:
        RESTORE_ALL

#ifdef CONFIG_KTAU
#include <linux/ktau/ktau_inst.h>
#endif /* CONFIG_KTAU */

int foo( ){
#ifdef CONFIG_KTAU
        GET_KTAU_INDEX();
        ktau_start_timer(&foo);
#endif /* CONFIG_KTAU */
        ……………………….
        ……………………….
        ……………………….
#ifdef CONFIG_KTAU
        ktau_stop_timer(&foo);
#endif /* CONFIG_KTAU */
}
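The start/stop timer pair shown above is what makes inclusive and exclusive time measurable. The following user-space sketch shows one common way to derive both from nested timers; it is an assumption about the bookkeeping, not KTAU's actual implementation. The names timer_state, sketch_start_timer and sketch_stop_timer are hypothetical, and timestamps are passed in as parameters rather than read from a cycle counter such as rdtsc, so the example stays portable.

```c
#include <assert.h>

#define MAX_FUNCS 8   /* hypothetical profile-table size */
#define MAX_DEPTH 16  /* hypothetical maximum nesting depth */

/* One active timer on the call stack. */
struct timer_frame {
    int                id;
    unsigned long long start;       /* timestamp at entry */
    unsigned long long child_time;  /* inclusive time of nested timers */
};

struct timer_state {
    struct timer_frame stack[MAX_DEPTH];
    int                depth;
    unsigned long long incl[MAX_FUNCS];  /* time including callees */
    unsigned long long excl[MAX_FUNCS];  /* time excluding callees */
};

/* Push a frame for function `id` at time `now`. */
static void sketch_start_timer(struct timer_state *s, int id,
                               unsigned long long now)
{
    assert(s->depth < MAX_DEPTH && id >= 0 && id < MAX_FUNCS);
    struct timer_frame *f = &s->stack[s->depth++];
    f->id = id;
    f->start = now;
    f->child_time = 0;
}

/* Pop the frame, update both totals, and charge the elapsed time to the parent. */
static void sketch_stop_timer(struct timer_state *s, int id,
                              unsigned long long now)
{
    assert(s->depth > 0 && s->stack[s->depth - 1].id == id);
    struct timer_frame *f = &s->stack[--s->depth];
    unsigned long long elapsed = now - f->start;

    s->incl[id] += elapsed;
    s->excl[id] += elapsed - f->child_time;
    if (s->depth > 0)
        s->stack[s->depth - 1].child_time += elapsed;
}
```

For example, if function 0 runs from t=100 to t=200 and calls function 1 from t=120 to t=150, function 0 ends up with an inclusive time of 100 and an exclusive time of 70. The sketch does not handle recursion, where inclusive time would be counted more than once.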
Scheduling Kernel Control Flow
Motivations • Context-Of-Execution-based profiling • Easily extensible with different types of instrumentation • Small Memory Footprint • No running daemon • SMP-Support
COE-based profiling • Maintain profile data in task_struct, which allows task-level as well as whole-system profiling • Consistent with TAU framework • No lookup is required
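The idea of keeping profile data inside the task structure can be illustrated with a small user-space sketch. Everything here is an assumption for illustration: sketch_task stands in for the kernel's task_struct, and sketch_profile, record_event and system_wide_count are hypothetical names. It shows why no lookup is required (the data hangs off the current task) and how a system-wide view falls out of summing the per-task tables.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_IDS 4  /* hypothetical number of instrumented functions */

/* Per-task profile table, indexed by function ID. */
struct sketch_profile {
    unsigned long long count[SKETCH_MAX_IDS];
};

/* Stand-in for the kernel's task_struct; only the fields this sketch needs. */
struct sketch_task {
    int                   pid;
    struct sketch_profile prof;
};

/* Record an event against the running task: no hash or search, just a field access. */
static void record_event(struct sketch_task *current_task, int id)
{
    assert(id >= 0 && id < SKETCH_MAX_IDS);
    current_task->prof.count[id]++;
}

/* Whole-system view: sum one function's counter over all tasks. */
static unsigned long long system_wide_count(const struct sketch_task *tasks,
                                            size_t ntasks, int id)
{
    unsigned long long total = 0;
    for (size_t i = 0; i < ntasks; i++)
        total += tasks[i].prof.count[id];
    return total;
}
```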
Requirements • Bi-Directional Communication • Extensible, allowing different commands • Scales with load • Able to support per-process & System profile • As independent of KTAU as possible
High-Level Design / Architecture (diagram) • Kernel Space: KTAU, KTAU - PROC (exposes /proc/ktau) • UserSpace: Lib-KTAU (KTAU - API, Control - API), Profiled Apps, Utilities • Communication: SYS_IOCTL, IOCTL on /proc/ktau
Motivations • Integrate kernel and user profiles • Interface with TAU Profiler through libKtau • Use ParaProf as visualization tool
TAU configuration
./configure -cc=icc -c++=icpc -ktau -ktau_merge \
    -ktauinc=/usr/src/linux-2.6.10-ktau-1.2/include/ \
    -fortran=intel -mpi \
    -mpiinc=/software/mpich2-install/include/ \
    -mpilib=/software/mpich2-install/lib/
/proc/kallsyms
c01589c0 T sys_open
c0158a50 T sys_creat
c0158a80 T filp_close
c0158b10 T sys_close
c0158ba0 T sys_vhangup
c0158be0 T generic_file_open
c0158c50 T nonseekable_open
c0158c62 t .text.lock.open
c0158d20 T generic_file_llseek
c0158e10 T remote_llseek
c0158f20 T no_llseek
c0158f30 T default_llseek
c0159000 T vfs_llseek
c0159070 T sys_lseek
c0159110 T sys_llseek
c01591e0 T do_sync_read
c01592d0 T vfs_read
c0159400 T do_sync_write
c01594f0 T vfs_write
c0159620 T sys_read
c01596a0 T sys_write
c0159720 T sys_pread64
c01597b0 T sys_pwrite64
c0159840 T iov_shorten
c0159880 t do_readv_writev
c0159b00 T vfs_readv
c0159b70 T vfs_writev
c0159be0 T sys_readv
c0159c60 T sys_writev

Interface
Profiler::Start
Profiler::StoreData

TauKtau.h
public: TauKtau::StartKProfile(); TauKtau::StopKProfile();
private: TauKtau::MapKallsyms(); TauKtau::ReadKallsyms(); TauKtau::DiffKProfile(); TauKtau::DumpKProfile();

Ktau_proc_interface.h
read_size(); read_data(); purge_data(); unpack_bindata();

/proc/ktau/read
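Mapping a profiled address back to a symbol name, as the /proc/kallsyms listing above suggests, can be sketched as follows. This is not the TauKtau::MapKallsyms implementation, whose internals the slides do not show; the struct ksym type and the helpers parse_kallsyms_line and resolve_addr are hypothetical, and the sketch assumes the table is sorted by ascending address, as in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One parsed /proc/kallsyms entry (hypothetical layout). */
struct ksym {
    unsigned long addr;      /* symbol start address */
    char          type;      /* symbol type letter, e.g. 'T' or 't' */
    char          name[64];  /* symbol name, truncated to 63 characters */
};

/* Parse one "address type name" line; returns 1 on success, 0 otherwise. */
static int parse_kallsyms_line(const char *line, struct ksym *out)
{
    return sscanf(line, "%lx %c %63s", &out->addr, &out->type, out->name) == 3;
}

/*
 * Find the symbol that contains `addr`: the last entry whose start address
 * is not above it. Assumes `syms` is sorted by ascending address.
 * Returns NULL if `addr` lies below the first symbol.
 */
static const struct ksym *resolve_addr(const struct ksym *syms, size_t n,
                                       unsigned long addr)
{
    assert(syms != NULL || n == 0);
    const struct ksym *best = NULL;
    for (size_t i = 0; i < n && syms[i].addr <= addr; i++)
        best = &syms[i];
    return best;
}
```

With the first three lines of the listing loaded, an address such as 0xc0158a60 resolves to sys_creat, because it falls between the start of sys_creat (c0158a50) and the start of filp_close (c0158a80). A linear scan keeps the sketch short; a binary search would be the natural choice for a full symbol table.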
Output Format • Kernel Profile (User and Interrupt Context) • System-wide • Individual process • Merged Kernel-User Profile (User Context)
Source Distribution • KTAU suite • Kernel patches • Kernel extension source • User-space library/tools • Documentation
Generic Kernel Profiling API Suite: BG/L Compute-node
Motivations • Goal: Provide a generic API for kernel profiling on the BG/L compute-node • BG/L Key Characteristics: • Common address space • Single Process
Strategies • Dynamic ID mapping to kernel routines ( Based on KTAU prototype ) • Dynamic registration of profiling handler routines
Instrumentation API • Provides kernel intercept points • Kernel routine taxonomy • Dynamic function ID mapping to routine
Instrumentation API (cont’d)
unsigned int get_fid(unsigned int group)
void start_timer(unsigned int funcId, unsigned int group, char *funcName)
void stop_timer(unsigned int funcId, unsigned int group, char *funcName)
void start_prof(unsigned int funcId, unsigned int group, char *funcName, unsigned int data)
void stop_prof(unsigned int funcId, unsigned int group, char *funcName, unsigned int data)
void event_prof(unsigned int funcId, unsigned int group, char *funcName, unsigned int data)
Registration API • Profiling tools selection - Combination of tools according to the tool strength • Group selection - Focus on the group of kernel routines of interest at a particular section of application
Registration API (cont’d)
int register_handler(unsigned int group_mask, void (*handler)(unsigned int, unsigned int, char *, void *))
int unregister_handler(unsigned int group_mask)
Handler API • Tool-specific routines for processing data
void handler(unsigned int funcId, unsigned int type, char *funcName, void *data)
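How the instrumentation, registration and handler APIs above could fit together is sketched below in user-space C. Only the function signatures come from the slides; the rest is assumed for illustration: the slot table, the prof_handler typedef, the EVENT_TYPE code, the 0/-1 return convention, treating each group as a bit that is tested against the registered mask, and the example counting_handler.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 4          /* hypothetical limit on registered tools */
enum { EVENT_TYPE = 2 };        /* hypothetical code passed as `type` */

typedef void (*prof_handler)(unsigned int, unsigned int, char *, void *);

struct handler_slot {
    unsigned int group_mask;    /* groups this tool wants to observe */
    prof_handler fn;            /* NULL marks a free slot */
};

static struct handler_slot slots[MAX_HANDLERS];

/* Register a tool for the groups in group_mask; 0 on success, -1 on failure. */
int register_handler(unsigned int group_mask, prof_handler handler)
{
    if (group_mask == 0 || handler == NULL)
        return -1;
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (slots[i].fn == NULL) {
            slots[i].group_mask = group_mask;
            slots[i].fn = handler;
            return 0;
        }
    }
    return -1;                  /* table full */
}

/* Remove every handler registered with exactly this mask; -1 if none matched. */
int unregister_handler(unsigned int group_mask)
{
    int removed = 0;
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (slots[i].fn != NULL && slots[i].group_mask == group_mask) {
            slots[i].fn = NULL;
            slots[i].group_mask = 0;
            removed++;
        }
    }
    return removed > 0 ? 0 : -1;
}

/* Intercept point: forward the event to every handler interested in `group`. */
void event_prof(unsigned int funcId, unsigned int group, char *funcName,
                unsigned int data)
{
    assert(funcName != NULL);
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (slots[i].fn != NULL && (slots[i].group_mask & group) != 0)
            slots[i].fn(funcId, EVENT_TYPE, funcName, &data);
    }
}

/* Example tool-specific handler: counts events and remembers the last payload. */
static unsigned int events_seen;
static unsigned int last_data;

static void counting_handler(unsigned int funcId, unsigned int type,
                             char *funcName, void *data)
{
    (void)funcId; (void)type; (void)funcName;
    events_seen++;
    last_data = *(unsigned int *)data;
}
```

A tool would call register_handler(0x1, counting_handler) before the application section of interest and unregister_handler(0x1) afterwards; events raised for other groups never reach it, which is the group selection the Registration API slide describes.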
Trace Merged Profile