Lecture 2a: Performance Measurement
Performance Evaluation • The primary duty of software developers is to create functionally correct programs • Performance evaluation is a part of software development for well-performing programs
Performance Analysis Cycle
• Have an optimization phase, just like the testing and debugging phases
• Cycle: Code Development → functionally complete and correct program → Measure → Analyze → Modify / Tune → complete, correct and well-performing program → Usage
Goals of Performance Analysis The goal of performance analysis is to provide quantitative information about the performance of a computer system
Goals of Performance Analysis
• Compare alternatives
  • When purchasing a new computer system, to provide quantitative information
• Determine the impact of a feature
  • In designing a new system or upgrading, to provide a before-and-after comparison
• System tuning
  • To find the parameters that produce the best overall performance
• Identify relative performance
  • To quantify the performance relative to previous generations
• Performance debugging
  • To identify the performance problems and correct them
• Set expectations
  • To determine the expected capabilities of the next generation
Performance Evaluation Performance Evaluation steps: • Measurement / Prediction • What to measure? How to measure? • Modeling for prediction • Simulation • Analytical Modeling • Analysis & Reporting • Performance metrics
Performance Measurement: Interval Timers • Hardware Timers • Software Timers
Performance Measurement: Hardware Timers
• The counter value is read from a memory location
• Clock (period Tc) → n-bit counter → processor memory bus
• Time is calculated as: Time = (x2 - x1) × Tc, where x1 and x2 are the counter values read before and after the event
Performance Measurement: Software Timers
• Interrupt-based: when an interrupt occurs, the interrupt-service routine increments the timer value, which is read by a program
• Clock (period Tc) → prescaling counter (period T'c) → processor interrupt input
• Time is calculated as: Time = (x2 - x1) × T'c
Performance Measurement: Timer Rollover
• Occurs when an n-bit counter undergoes a transition from its maximum value 2^n - 1 to zero
• There is a trade-off between rollover time and accuracy: a shorter clock period gives finer resolution but makes the counter roll over sooner
Timers: Rollover Solution
• Use a 64-bit integer (over half a million years before rollover at microsecond resolution)
• The timer returns two values: one represents seconds, one represents microseconds since the last second
• With a 32-bit seconds value, rollover takes over 100 years
Performance Measurement: Interval Timers
    T0 = read current time;
    event being timed ();
    T1 = read current time;
• Time for the event is: T1 - T0
Performance Measurement: Timer Overhead
Timeline of one measurement:
• T1: read_time is initiated → current time is read
• T2: current time is read → event begins
• T3: event begins → event ends, and read_time is initiated again
• T4: read_time is initiated → current time is read
• Measured time: Tm = T2 + T3 + T4
• Desired measurement: Te = T3 = Tm - (T2 + T4) = Tm - (T1 + T2), since T1 = T4
• Timer overhead: Tovhd = T1 + T2
• Te should be 100-1000 times greater than Tovhd
Performance Measurement: Timer Resolution
• Resolution is the smallest change that can be detected by an interval timer
• A count of n ticks only bounds the event time: nT'c < Te < (n+1)T'c
• If Tc is large relative to the event being measured, it may be impossible to measure the duration of the event
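Performance Measurement: Measuring Short Intervals (Te < Tc)
[Figure: an event of duration Te shorter than the clock period Tc. The measurement is 1 when a clock tick falls inside the event and 0 when it does not.]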
Performance Measurement: Measuring Short Intervals
• Solution: repeat the measurement n times
• Average execution time from tick counts: T'e = (m × Tc) / n, where m is the number of 1s measured
• Average execution time from a timed loop: T'e = (Tt / n) - h, where Tt is the total execution time of n repetitions and h is the repetition overhead
Performance Measurement: Time
• Elapsed time / wall-clock time / response time: latency to complete a task, including disk access, memory access, I/O, operating system overhead, and everything else (includes time consumed by other programs in a time-sharing system)
• CPU time: the time the CPU is computing, not including I/O time or waiting time
  • User time / user CPU time: CPU time spent in the program
  • System time / system CPU time: CPU time spent in the operating system performing tasks requested by the program
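Performance Measurement: UNIX time command
Example output: 90.7u 12.9s 2:39 65%
• 90.7u: user time (seconds)
• 12.9s: system time (seconds)
• 2:39: elapsed time (minutes:seconds)
• 65%: CPU time as a percentage of elapsed time, here (90.7 + 12.9) s / 159 s ≈ 65%
Drawbacks:
• Resolution is in milliseconds
• Different sections of the code cannot be timed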
Timers
A timer is a function, subroutine or program that can be used to return the amount of time spent in a section of code.
Timer that returns the time elapsed since its argument:
    zero = 0.0;
    t0 = timer(&zero);
    … < code segment > …
    t1 = timer(&t0);
    time = t1;
Timer that returns the current time:
    t0 = timer();
    … < code segment > …
    t1 = timer();
    time = t1 - t0;
Timers Read Wadleigh, Crawford pg 130-136 for: time, clock, gettimeofday, etc.
Timers: Measuring Timer Resolution
    main()
    {
      . . .
      zero = 0.0;
      t0 = timer(&zero);
      t1 = 0.0;
      j = 0;
      /* increase the work until the timer reports a nonzero time */
      while (t1 == 0.0) {
        j++;
        zero = 0.0;
        t0 = timer(&zero);
        foo(j);
        t1 = timer(&t0);
      }
      printf("It took %d iterations for a nonzero time\n", j);
      if (j == 1)
        printf("timer resolution <= %13.7f seconds\n", t1);
      else
        printf("timer resolution is %13.7f seconds\n", t1);
    }

    foo(n)
    {
      . . .
      i = 0;
      for (j = 0; j < n; j++)
        i++;
      return (i);
    }
Timers: Measuring Timer Resolution (results)
Using clock():
    It took 682 iterations for a nonzero time
    timer resolution is 0.0200000 seconds
Using times():
    It took 720 iterations for a nonzero time
    timer resolution is 0.0200000 seconds
Using getrusage():
    It took 7374 iterations for a nonzero time
    timer resolution is 0.0002700 seconds
Timers: Spin Loops
• For codes that take less time to run than the resolution of the timer
• The first call to a function may require an inordinate amount of time, so the minimum of all times may be desired
    main()
    {
      . . .
      zero = 0.0;
      t2 = 100000.0;
      for (j = 0; j < n; j++) {
        t0 = timer(&zero);
        foo(j);
        t1 = timer(&t0);
        t2 = min(t2, t1);
      }
      t2 = t2 / n;
      printf("Minimum time is %13.7f seconds\n", t2);
    }

    foo(n)
    {
      . . .
      < code segment >
    }
Profilers
• A profiler automatically inserts timing calls into applications
• It is used to identify the portions of the program that consume the largest fraction of the total execution time
• It may also be used to find system-level bottlenecks in a multitasking system
• Profilers may alter the timing of a program's execution
Profilers: Data collection techniques
• Sampling-based
  • This type of profiler uses a predefined clock; at every multiple of the clock tick the program is interrupted and its state information is recorded
  • It gives a statistical profile of the program's behavior
  • It may miss some important events
• Event-based
  • Events are defined (e.g. entry into a subroutine) and data about these events are collected
  • The collected information shows the exact execution frequencies
  • It has a substantial run-time overhead and memory requirement
Information kept
• Trace-based: the profiler keeps all the information it collects
• Reductionist: only statistical information is kept
Performance Evaluation Performance Evaluation steps: • Measurement / Prediction • What to measure? How to measure? • Modeling for prediction • Simulation • Analytical Modeling • Queuing Theory • Analysis & Reporting • Performance metrics
Predicting Performance
• The performance of simple kernels can be predicted to a high degree of accuracy
• Measured performance should be close to the theoretical peak performance
• Preferably, the measured performance is over 80% of the theoretical peak performance