1 / 12

Performance – Last Lecture

Performance – Last Lecture. Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance B. Example.

ramla
Télécharger la présentation

Performance – Last Lecture

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Performance – Last Lecture • Bottom line performance measure is time • PerformanceA = 1/Execution TimeA • Comparing Performance • N = PerformanceA / PerformanceB

  2. Example • If a machine A runs a program 25 seconds and machine B runs the same program in 20 seconds, how much faster is machine B that machine A?

  3. Metrics of performance Answers per month Useful Operations per second Application Programming Language Compiler (millions) of Instructions per second – MIPS (millions) of (F.P.) operations per second – MFLOP/s ISA Datapath Megabytes per second Control Function Units Cycles per second (clock rate) Transistors Wires Pins Each metric has a place and a purpose, and each can be misused

  4. Relating Metrics • Instead of reporting execution time in seconds, we often use cycles • So, to improve performance (everything else being equal) you can either________ the # of required cycles for a program, or________ the clock cycle time or, said another way, ________ the clock rate.

  5. Example • Our favourite program runs in 10 seconds on computer A, which has a400 Mhz. clock. We are trying to help a computer designer build a new machine B, that will run this program in 6 seconds. The designer can use new (or perhaps more expensive) technology to substantially increase the clock rate, but has informed us that this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for the same program. What clock rate should we tell the designer to target?"

  6. CPU time = Seconds = Instructions x Cycles x Seconds Program Program Instruction Cycle • Following formula relates most basic metrics to CPU time: • How do we measure these metrics?

  7. Measuring CPI • Often obtained by a detailed simulation of an architecture or by using CPU counters. • Sometimes know clock cycle count for different instruction types:

  8. Amdahl's Law Execution Time After Improvement = Execution Time Unaffected +( Execution Time Affected / Amount of Improvement ) • Example: "Suppose a program runs in 100 seconds on a machine,withmultiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?" How about making it 5 times faster? • Principle: Make the common case fast

  9. Example • Suppose we enhance a machine making all floating-point instructions run five times faster. If the execution time of some benchmark before the floating-point enhancement is 10 seconds, what will the speedup be if half of the 10 seconds is spent executing floating-point instructions? • We are looking for a benchmark to show off the new floating-point unit described above, and want the overall benchmark to show a speedup of 3. One benchmark we are considering runs for 100 seconds with the old floating-point hardware. How much of the execution time would floating-point instructions have to account for in this program in order to yield our desired speedup on this benchmark?

  10. Remember • Performance is specific to a particular program/s • Total execution time is a consistent summary of performance • For a given architecture performance increases come from: • increases in clock rate (without adverse CPI affects) • improvements in processor organization that lower CPI • compiler enhancements that lower CPI and/or instruction count • Pitfall: expecting improvement in one aspect of a machine’s performance to affect the total performance

  11. Evaluating Performance • Real workloads good but often not possible • Mostly use benchmarks • Real applications best • SPEC benchmarks – CPU92, CPU95, CPU2000 • SPECweb99 • Guiding principle to reporting performance is “reproducibility” • i.e. detailing: • CPU details • Software (e.g. compiler version etc)

  12. Example • /proc/cpuifo processor : 0 vendor_id : GenuineIntel type : primary processor cpu family : 6 model : 9 model name : Intel(R) Pentium(R) M processor 1600MHz stepping : 5 brand id : 6 cpu count : 0 apic id : 0 cpu MHz : 599 fpu : yes flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clfl dtes acpi mmx fxsr sse sse2 tmi pbe tm2 est

More Related