Text Book • R. Jain, “Art of Computer Systems Performance Analysis,” Wiley, 1991, ISBN: 0471503363 (Winner of the “1992 Best Computer Systems Book” Award from Computer Press Association)
Objectives: What You Will Learn • Specifying performance requirements • Evaluating design alternatives • Comparing two or more systems • Determining the optimal value of a parameter (system tuning) • Finding the performance bottleneck (bottleneck identification) • Characterizing the load on the system (workload characterization) • Determining the number and sizes of components (capacity planning) • Predicting the performance at future loads (forecasting).
Basic Terms • System: Any collection of hardware, software, and firmware. • Metrics: Criteria used to evaluate the performance of the system components. • Workloads: The requests made by the users of the system.
Main Parts of the Course • An Overview of Performance Evaluation • Measurement Techniques and Tools • Experimental Design and Analysis
Measurement Techniques and Tools • Types of Workloads • Popular Benchmarks • The Art of Workload Selection • Workload Characterization Techniques • Monitors • Accounting Logs • Monitoring Distributed Systems • Load Drivers • Capacity Planning • The Art of Data Presentation • Ratio Games
Example • Which type of monitor (software or hardware) would be more suitable for measuring each of the following quantities: • Number of Instructions executed by a processor? • Degree of multiprogramming on a timesharing system? • Response time of packets on a network?
Example • The performance of a system depends on the following three factors: • Garbage collection technique used: G1, G2, or none. • Type of workload: editing, computing, or AI. • Type of CPU: C1, C2, or C3. How many experiments are needed? How does one estimate the performance impact of each factor?
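One way to answer the first question is to enumerate a full factorial design, which runs every combination of factor levels. The sketch below is only an illustration of that one design choice; the dictionary keys and the helper name `full_factorial` are hypothetical, and fractional designs (covered later in the course) can need fewer runs.

```python
from itertools import product

# Factor levels taken from the example; the key names are illustrative.
factors = {
    "garbage_collection": ["G1", "G2", "none"],
    "workload": ["editing", "computing", "AI"],
    "cpu": ["C1", "C2", "C3"],
}

def full_factorial(factors):
    """Return one experiment (a dict of factor -> level) per combination."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

experiments = full_factorial(factors)
print(len(experiments))  # 3 * 3 * 3 = 27 experiments
```

With three factors at three levels each, a full factorial design needs 27 experiments. Estimating the impact of each factor from those runs is the subject of the experimental design part of the course.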
Example • The average response time of a database system is three seconds. During a one-minute observation interval, the idle time on the system was ten seconds. Using a queueing model for the system, determine the following: • System utilization • Average service time per query • Number of queries completed during the observation interval • Average number of jobs in the system • Probability of number of jobs in the system being greater than 10 • 90-percentile response time • 90-percentile waiting time
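The example does not say which queueing model to use. The sketch below assumes the simplest one, a single-server M/M/1 queue (Poisson arrivals, exponential service times), and applies its standard formulas. The function name `mm1_summary` and the result keys are hypothetical.

```python
import math

def mm1_summary(mean_response_time, interval, idle_time):
    """Derive M/M/1 quantities from observed response time and idle time.

    Assumes an M/M/1 queue; all times are in seconds.
    """
    busy_time = interval - idle_time
    rho = busy_time / interval                      # utilization
    service_time = mean_response_time * (1 - rho)   # from E[r] = E[s] / (1 - rho)
    completed = busy_time / service_time            # queries served while busy
    mean_jobs = rho / (1 - rho)                     # mean number in system
    p_more_than_10 = rho ** 11                      # P(n > 10) = rho^(10 + 1)
    r90 = mean_response_time * math.log(10)         # 90th percentile of response time
    # 90th percentile of waiting time; zero if fewer than 10% of jobs wait at all.
    w90 = max(0.0, mean_response_time * math.log(10 * rho))
    return {
        "utilization": rho,
        "service_time": service_time,
        "completed": completed,
        "mean_jobs": mean_jobs,
        "p_more_than_10": p_more_than_10,
        "r90": r90,
        "w90": w90,
    }

result = mm1_summary(mean_response_time=3.0, interval=60.0, idle_time=10.0)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under the M/M/1 assumption this gives a utilization of about 83%, a service time of 0.5 seconds, 100 completed queries, and an average of 5 jobs in the system. A different queueing model would give different numbers.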
Common Mistakes in Evaluation • No Goals • No general purpose model • Goals ⇒ Techniques, Metrics, Workload • Not trivial • Biased Goals • “To show that OUR system is better than THEIRS” • Analysts = Jury • Unsystematic Approach • Analysis Without Understanding the Problem • Incorrect Performance Metrics • Unrepresentative Workload • Wrong Evaluation Technique
Common Mistakes (Cont) • Overlook Important Parameters • Ignore Significant Factors • Inappropriate Experimental Design • Inappropriate Level of Detail • No Analysis • Erroneous Analysis • No Sensitivity Analysis • Ignoring Errors in Input • Improper Treatment of Outliers • Assuming No Change in the Future • Ignoring Variability • Too Complex Analysis
Common Mistakes (Cont) • Improper Presentation of Results • Ignoring Social Aspects • Omitting Assumptions and Limitations
Checklist for Avoiding Common Mistakes • Is the system correctly defined and the goals clearly stated? • Are the goals stated in an unbiased manner? • Have all the steps of the analysis been followed systematically? • Is the problem clearly understood before analyzing it? • Are the performance metrics relevant for this problem? • Is the workload correct for this problem? • Is the evaluation technique appropriate? • Is the list of parameters that affect performance complete? • Have all parameters that affect performance been chosen as factors to be varied?
Checklist (Cont) • Is the experimental design efficient in terms of time and results? • Is the level of detail proper? • Is the measured data presented with analysis and interpretation? • Is the analysis statistically correct? • Has the sensitivity analysis been done? • Would errors in the input cause an insignificant change in the results? • Have the outliers in the input or output been treated properly? • Have the future changes in the system and workload been modeled? • Has the variance of input been taken into account?
Checklist (Cont) • Has the variance of the results been analyzed? • Is the analysis easy to explain? • Is the presentation style suitable for its audience? • Have the results been presented graphically as much as possible? • Are the assumptions and limitations of the analysis clearly documented?
A Systematic Approach to Performance Evaluation • State Goals and Define the System • List Services and Outcomes • Select Metrics • List Parameters • Select Factors to Study • Select Evaluation Technique • Select Workload • Design Experiments • Analyze and Interpret Data • Present Results • Repeat
Criteria for Selecting an Evaluation Technique
Three Rules of Validation • Do not trust the results of an analytical model until they have been validated by a simulation model or measurements. • Do not trust the results of a simulation model until they have been validated by analytical modeling or measurements. • Do not trust the results of a measurement until they have been validated by simulation or analytical modeling.
Selecting Metrics • Include: • Performance: Time, Rate, Resource • Error rate, probability • Time to failure and duration • Consider including: • Mean and variance • Individual and Global • Selection Criteria: • Low-variability • Non-redundancy • Completeness
Case Study: Two Congestion Control Algorithms • Service: Send packets from specified source to specified destination in order. • Possible outcomes: • Some packets are delivered in order to the correct destination. • Some packets are delivered out-of-order to the destination. • Some packets are delivered more than once (duplicates). • Some packets are dropped on the way (lost packets).
Case Study (Cont) • Performance: For packets delivered in order, • Time-rate-resource • Response time to deliver the packets • Throughput: the number of packets per unit of time. • Processor time per packet on the source end system. • Processor time per packet on the destination end systems. • Processor time per packet on the intermediate systems. • Variability of the response time ⇒ Retransmissions • Response time: the delay inside the network
Case Study (Cont) • Out-of-order packets consume buffers ⇒ Probability of out-of-order arrivals. • Duplicate packets consume the network resources ⇒ Probability of duplicate packets • Lost packets require retransmission ⇒ Probability of lost packets • Too much loss causes disconnection ⇒ Probability of disconnect
Case Study (Cont) • Shared Resource ⇒ Fairness • Fairness Index Properties: • Always lies between 0 and 1. • Equal throughput ⇒ Fairness = 1. • If k of n users receive throughput x and the other n-k users receive zero throughput, the fairness index is k/n.
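The listed properties can be checked numerically. The sketch below assumes the standard definition of the fairness index from the textbook, f = (Σ x_i)² / (n · Σ x_i²), where x_i is the throughput of user i. The function name `fairness_index` is hypothetical.

```python
def fairness_index(throughputs):
    """Fairness index f = (sum x_i)^2 / (n * sum x_i^2) for per-user throughputs."""
    n = len(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one user with non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * sum_sq)

# Equal throughput for all users gives an index of 1.
print(fairness_index([5.0, 5.0, 5.0, 5.0]))

# k = 3 of n = 5 users receive x = 2, the rest receive nothing: index is k/n.
print(fairness_index([2.0, 2.0, 2.0, 0.0, 0.0]))
```

The second call illustrates the k/n property: three of five users share the resource equally, so the index is 3/5.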
Case Study (Cont) • Throughput and delay were found redundant ⇒ Use Power. • Variance in response time was found redundant with the probability of duplication and the probability of disconnection • Total nine metrics.
Commonly Used Performance Metrics • Response time and Reaction time
Common Performance Metrics (Cont) • Nominal Capacity: Maximum achievable throughput under ideal workload conditions. E.g., bandwidth in bits per second. The response time at maximum throughput is too high. • Usable capacity: Maximum throughput achievable without exceeding a pre-specified response-time limit • Knee Capacity: Knee = Low response time and High throughput
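The three capacities can be read off a measured throughput versus response-time curve. The sketch below uses made-up measurements and two simplifying assumptions: the highest observed throughput stands in for nominal capacity, and the knee is taken as the point that maximizes power (throughput divided by response time). The data and function names are hypothetical.

```python
# Hypothetical measurements: (throughput in requests/s, response time in s).
measurements = [
    (10, 0.10), (20, 0.12), (30, 0.15),
    (40, 0.25), (50, 0.60), (55, 2.00),
]

def nominal_capacity(measurements):
    """Approximate nominal capacity by the highest throughput observed."""
    return max(tp for tp, _ in measurements)

def usable_capacity(measurements, response_time_limit):
    """Highest throughput whose response time stays within the limit."""
    within_limit = [tp for tp, rt in measurements if rt <= response_time_limit]
    return max(within_limit) if within_limit else None

def knee_capacity(measurements):
    """Throughput at the point of maximum power (assumed knee convention)."""
    tp, _ = max(measurements, key=lambda m: m[0] / m[1])
    return tp

print(nominal_capacity(measurements))
print(usable_capacity(measurements, response_time_limit=0.5))
print(knee_capacity(measurements))
```

With these numbers the knee falls well below the nominal capacity, which is the usual situation: response time grows sharply before maximum throughput is reached.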
Common Performance Metrics (cont) • Turnaround time = the time between the submission of a batch job and the completion of its output. • Stretch Factor: The ratio of the response time with multiprogramming to that without multiprogramming. • Throughput: Rate (requests per unit of time) Examples: • Jobs per second • Requests per second • Millions of Instructions Per Second (MIPS) • Millions of Floating Point Operations Per Second (MFLOPS) • Packets Per Second (PPS) • Bits per second (bps) • Transactions Per Second (TPS)
Common Performance Metrics (Cont) • Efficiency: Ratio of usable capacity to nominal capacity. Or, the ratio of the performance of an n-processor system to that of a one-processor system is its efficiency. • Utilization: The fraction of time the resource is busy servicing requests. Average fraction used for memory.
Common Performance Metrics (Cont) • Reliability: • Probability of errors • Mean time between errors (error-free seconds). • Availability: • Mean Time to Failure (MTTF) • Mean Time to Repair (MTTR) • MTTF/(MTTF+MTTR)
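The availability ratio is simple to compute once both means are in the same unit. A minimal sketch, with illustrative values that are not taken from a measured system; the function name `availability` is hypothetical.

```python
def availability(mttf, mttr):
    """Fraction of time the system is up: MTTF / (MTTF + MTTR).

    Both arguments must use the same time unit.
    """
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf / (mttf + mttr)

# Illustrative values: 84 hours between failures, 1 hour to repair.
print(f"{availability(mttf=84.0, mttr=1.0):.4f}")
```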
Setting Performance Requirements • Examples: “The system should be both processing and memory efficient. It should not create excessive overhead.” “There should be an extremely low probability that the network will duplicate a packet, deliver a packet to the wrong destination, or change the data in a packet.” • Problems: Non-Specific, Non-Measurable, Non-Acceptable, Non-Realizable, Non-Thorough ⇒ SMART (Specific, Measurable, Acceptable, Realizable, Thorough)
Case Study 3.2: Local Area Networks • Service: Send frame to D • Outcomes: • Frame is correctly delivered to D • Incorrectly delivered • Not delivered at all • Requirements: • Speed • The access delay at any station should be less than one second. • Sustained throughput must be at least 80 Mbits/sec. • Reliability: Five different error modes. • Different amount of damage • Different level of acceptability.
Case Study (Cont) • The probability of any bit being in error must be less than 1E-7. • The probability of any frame being in error (with error indication set) must be less than 1%. • The probability of a frame in error being delivered without error indication must be less than 1E-15. • The probability of a frame being misdelivered due to an undetected error in the destination address must be less than 1E-18. • The probability of a frame being delivered more than once (duplicate) must be less than 1E-5. • The probability of losing a frame on the LAN (due to all sorts of errors) must be less than 1%.
Case Study (Cont) • Availability: Two fault modes – Network reinitializations and permanent failures • The mean time to initialize the LAN must be less than 15 milliseconds. • The mean time between LAN initializations must be at least one minute. • The mean time to repair a LAN must be less than one hour. (LAN partitions may be operational during this period.) • The mean time between LAN partitioning must be at least one-half a week.
Measurement Techniques and Tools Measurements are not to provide numbers but insight - Ingrid Bucher • What are the different types of workloads? • Which workloads are commonly used by other analysts? • How are the appropriate workload types selected? • How is the measured workload data summarized? • How is the system performance monitored? • How can the desired workload be placed on the system in a controlled manner? • How are the results of the evaluation presented?
Terminology • Test workload: Any workload used in performance studies. Test workload can be real or synthetic. • Real workload: Observed on a system being used for normal operations. • Synthetic workload: • Similar to real workload • Can be applied repeatedly in a controlled manner • No large real-world data files • No sensitive data • Easily modified without affecting operation • Easily ported to different systems due to its small size • May have built-in measurement capabilities.
Test Workloads for Computer Systems • Addition Instruction • Instruction Mixes • Kernels • Synthetic Programs • Application Benchmarks
Addition Instruction • Processors were the most expensive and most used components of the system • Addition was the most frequent instruction
Instruction Mixes • Instruction mix = instructions + usage frequency • Gibson mix: Developed by Jack C. Gibson in 1959 for IBM 704 systems.
Instruction Mixes (Cont) • Disadvantages: • Complex classes of instructions not reflected in the mixes. • Instruction time varies with: • Addressing modes • Cache hit rates • Pipeline efficiency • Interference from other devices during processor-memory access cycles • Parameter values • Frequency of zeros as a parameter • The distribution of zero digits in a multiplier • The average number of positions of preshift in floating-point add • Number of times a conditional branch is taken
Instruction Mixes (Cont) • Performance Metrics: • MIPS = Millions of Instructions Per Second • MFLOPS = Millions of Floating Point Operations Per Second
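An instruction mix turns into a MIPS figure through the frequency-weighted average instruction time. The sketch below uses a made-up mix. The instruction classes, frequencies, and times are hypothetical and are not the Gibson mix.

```python
import math

# Hypothetical mix: class -> (usage frequency, execution time in microseconds).
mix = {
    "load_store": (0.30, 0.8),
    "add_sub": (0.40, 0.5),
    "multiply": (0.10, 2.0),
    "branch": (0.20, 0.6),
}

def average_instruction_time(mix):
    """Frequency-weighted mean instruction time, in microseconds."""
    total_frequency = sum(freq for freq, _ in mix.values())
    if not math.isclose(total_frequency, 1.0):
        raise ValueError("usage frequencies must sum to 1")
    return sum(freq * time for freq, time in mix.values())

def mips(mix):
    """Millions of instructions per second: 1 / (average time in microseconds)."""
    return 1.0 / average_instruction_time(mix)

print(f"average time: {average_instruction_time(mix):.2f} us")
print(f"rate: {mips(mix):.2f} MIPS")
```

Because the result depends entirely on the chosen frequencies, two analysts using different mixes can report different MIPS ratings for the same processor.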
Kernels • Kernel = nucleus • Kernel = the most frequent function • Commonly used kernels: Sieve, Puzzle, Tree Searching, Ackermann's Function, Matrix Inversion, and Sorting. • Disadvantages: Do not make use of I/O devices
Synthetic Programs • The need to measure I/O performance led analysts to develop exerciser loops. • The first exerciser loop was by Buchholz (1969), who called it a synthetic program. • A Sample Exerciser: See program listing Figure 4.1 in the book
Synthetic Programs • Advantages: • Quickly developed and given to different vendors. • No real data files • Easily modified and ported to different systems. • Have built-in measurement capabilities • Measurement process is automated • Repeated easily on successive versions of the operating systems • Disadvantages: • Too small • Do not make representative memory or disk references • Mechanisms for page faults and disk cache may not be adequately exercised. • CPU-I/O overlap may not be representative. • Loops may create synchronizations ⇒ better or worse performance.
Application Benchmarks • For a particular industry: Debit-Credit for Banks • Benchmark = workload (Except instruction mixes) • Some Authors: Benchmark = set of programs taken from real workloads • Popular Benchmarks
Sieve • Based on Eratosthenes' sieve algorithm: find all prime numbers below a given number n. • Algorithm: • Write down all integers from 1 to n • Strike out all multiples of k, for k = 2, 3, …, √n. • Example: • Write down all numbers from 1 to 20. Mark all as prime: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 • Remove all multiples of 2 from the list of primes. Remaining: 1, 2, 3, 5, 7, 9, 11, 13, 15, 17, 19
Sieve (Cont) • The next integer in the sequence is 3. Remove all multiples of 3. Remaining: 1, 2, 3, 5, 7, 11, 13, 17, 19 • The next integer in the sequence is 5, and 5 > √20 ⇒ Stop • Pascal Program to Implement the Sieve Kernel: See program listing Figure 4.2 in the book
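The steps of the example can be sketched in a few lines. This is not the Pascal listing from the book, only a minimal Python illustration of the same idea. Unlike the slide's list, it treats 1 as non-prime, and it includes n itself in the range it checks. The function name `sieve` is hypothetical.

```python
import math

def sieve(n):
    """Return all primes up to and including n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    # Only k up to sqrt(n) needs to be considered; larger k has no
    # multiples left to strike out within the range.
    for k in range(2, math.isqrt(n) + 1):
        if is_prime[k]:
            # Smaller multiples of k were already struck out by smaller factors.
            for multiple in range(k * k, n + 1, k):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve(20))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

As a kernel, the interest is less in the primes themselves than in the time the nested loops take on the processor under test.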