Performance Evaluation Course Manijeh Keshtgary Shiraz University of Technology Spring 1392
Goals of This Course • Comprehensive course on performance analysis • Includes measurement, statistical modeling, experimental design, simulation, and queuing theory • How to avoid common mistakes in performance analysis • Graduate course (advanced topics): a lot of independent reading, plus a project or survey paper
Textbooks • Required: Raj Jain. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, John Wiley and Sons, Inc., New York, NY, 1991. ISBN: 0471503363
Grading • Midterm: 40% • Final: 40% • HW & Paper: 20%
Objectives: What You Will Learn • Specifying performance requirements • Evaluating design alternatives • Comparing two or more systems • Determining the optimal value of a parameter (system tuning) • Finding the performance bottleneck (bottleneck identification) • Characterizing the load on the system (workload characterization) • Determining the number and sizes of components (capacity planning) • Predicting the performance at future loads (forecasting)
What Is Performance Evaluation? • Performance evaluation is about quantifying the service delivered by a computer or communication system. • For example, we might be interested in comparing the power consumption of several server farm configurations. • It is important to carefully define the load and the metric, and to be aware of the performance evaluation goals.
Introduction • Computer system users, administrators, and designers are all interested in performance evaluation, since the goal is to obtain or to provide the highest performance at the lowest cost. • A system could be any collection of HW, SW, and firmware components, e.g., CPU, DB system, network. • Computer performance evaluation is of vital importance in: • the selection of computer systems, • the design of applications and equipment, and • the analysis of existing systems.
Basic Terms • System: Any collection of hardware, software, and firmware • Metrics: Criteria used to evaluate the performance of the system components • Workloads: The requests made by the users of the system
Main Parts of the Course • Part I: An Overview of Performance Evaluation • Part II: Measurement Techniques and Tools • Part III: Probability Theory and Statistics • Part IV: Experimental Design and Analysis • Part V: Simulation • Part VI: Queuing Theory
Part I: An Overview of Performance Evaluation • Introduction • Common Mistakes and How To Avoid Them • Selection of Techniques and Metrics
Part II: Measurement Techniques and Tools • Types of Workloads • Popular Benchmarks • The Art of Workload Selection • Workload Characterization Techniques • Monitors • Accounting Logs • Monitoring Distributed Systems • Load Drivers • Capacity Planning • The Art of Data Presentation
Part III: Probability Theory and Statistics • Probability and Statistics Concepts • Four Important Distributions • Summarizing Measured Data By a Single Number • Summarizing The Variability Of Measured Data • Graphical Methods to Determine Distributions of Measured Data • Sample Statistics • Confidence Interval • Comparing Two Alternatives • Measures of Relationship • Simple Linear Regression Models
Part IV: Experimental Design and Analysis • Introduction to Experimental Design • One Factor Experiments
Part V: Simulation • Introduction to Simulation • Types of Simulations • Model Verification and Validation • Analysis of Simulation Results • Random-Number Generation • Testing Random-Number Generators • Commonly Used Distributions
Part VI: Queuing Theory • Introduction to Queuing Theory • Analysis of a Single Queue • Queuing Networks • Operational Laws • Mean Value Analysis and Related Techniques • Advanced Techniques
Purpose of Evaluation Three general purposes of performance evaluation: • Selection evaluation: the system exists elsewhere • Performance projection: the system does not yet exist • Performance monitoring: the system is in operation
Selection Evaluation • The most frequent case: performance is included as a major criterion in the decision to obtain a particular system from a vendor • To determine which of the available alternatives are suitable for a given application • To choose according to some specified selection criteria • At least one prototype of the proposed system must exist
Performance Projection • Oriented towards designing a new system: the goal is to estimate the performance of a system that does not yet exist • Secondary goal: projection of a given system onto a new workload, i.e., modifying an existing system in order to increase its performance, decrease its cost, or both (tuning therapy) • Upgrading of a system: replacement or addition of one or more hardware components
Performance Monitoring • Usually performed for a substantial portion of the lifetime of an existing running system • Performance monitoring is done: • to detect bottlenecks • to predict future capacity shortcomings • to determine the most cost-effective way to upgrade the system • to overcome performance problems, and • to cope with increasing workload demands
Evaluation Metrics • A computer system, like any other engineering machine, can be measured and evaluated in terms of how well it meets the needs and expectations of its users. • It is desirable to evaluate the performance of a computer system because we want to make sure that it is suitable for its intended applications, and that it satisfies the given efficiency and reliability requirements. • We also want to operate the computer system near its optimal level of processing power under the given resource constraints.
Three Basic Issues All performance measures deal with three basic issues: • How quickly a given task can be accomplished, • How well the system can deal with failures and other unusual situations, and • How effectively the system uses the available resources.
Performance Measures We can categorize the performance measures as follows: • Responsiveness • Usage Level • Missionability • Dependability • Productivity
Responsiveness • These measures are intended to evaluate how quickly a given task can be accomplished by the system. Possible measures are: waiting time, processing time, queue length, etc.
Usage Level and Missionability • Usage Level: These measures are intended to evaluate how well the various components of the system are being used. • Possible measures are: throughput and utilization of various resources. • Missionability: These measures indicate whether the system would remain continuously operational for the duration of a mission. Possible measures: interval availability (the probability that the system will keep performing satisfactorily throughout the mission time) and lifetime (the time at which the probability of unacceptable behavior increases beyond some threshold). These measures are useful when repair/tuning is impractical or when unacceptable behavior may be catastrophic.
Dependability • These measures indicate how reliable the system is over the long run. Possible measures are: • Number of failures per day • MTTF (mean time to failure) • MTTR (mean time to repair) • Long-term availability and cost of a failure • These measures are useful when repairs are possible and failures are tolerable.
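As a rough illustration of how the dependability measures above relate to one another, long-term (steady-state) availability is commonly computed as MTTF / (MTTF + MTTR). The sketch below assumes that standard relation; the function name and the numbers are hypothetical and are not taken from the course material.

```python
def long_term_availability(mttf_hours, mttr_hours):
    """Steady-state availability: the long-run fraction of time the system is up.

    Assumes the usual relation availability = MTTF / (MTTF + MTTR).
    """
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical numbers, for illustration only.
availability = long_term_availability(mttf_hours=1000.0, mttr_hours=2.0)
downtime_hours_per_year = (1.0 - availability) * 365 * 24

print(f"availability      = {availability:.4f}")
print(f"expected downtime = {downtime_hours_per_year:.1f} hours/year")
```

Shortening the repair time raises availability even when failures are no less frequent, which is one reason MTTF and MTTR are reported separately.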
Productivity • These measures indicate how effectively a user can get his or her work accomplished. Possible measures are: • User friendliness • Maintainability • Understandability
Which Measures for What System • The relative importance of various measures depends on the application involved. • In the following, we provide a broad classification of computer systems according to their application domains, indicating which measures are most relevant. 1. General purpose computing • These systems are designed for general purpose problem solving. • Relevant measures are: responsiveness, usage level, and productivity.
2. High availability • Such systems are designed for transaction processing environments (bank, airline, or telephone databases; switching systems; etc.). • The most important measures are responsiveness and dependability. • Both of these requirements are more severe than for general purpose computing systems; moreover, any data corruption or destruction is unacceptable. • Productivity is also an important factor.
3. Real-time control • Such systems must respond to both periodic and randomly occurring events within some (possibly hard) timing constraints. • They require high levels of responsiveness and dependability for most workloads and failure types and are therefore significantly over-designed. • Utilization and throughput play little role in such systems.
4. Mission oriented These systems require extremely high levels of reliability over a short period, called the mission time. Little or no repair/tuning is possible during the mission. Such systems include fly-by-wire airplanes, battlefield systems, and spacecraft. Responsiveness is also important, but usually not difficult to achieve. Such systems may try to achieve high reliability during the short term at the expense of poor reliability beyond the mission period.
5. Long-life Systems like the ones used for unmanned spaceships need long life without provision for manual diagnostics and repairs. Thus, in addition to being highly dependable, they should have considerable intelligence built in to do diagnostics and repair either automatically or by remote control from a ground station. Responsiveness is important but not difficult to achieve.
1.2 Techniques of Performance Evaluation • Measurement • Simulation • Analytic modeling • The latter two techniques can also be combined to get what is usually known as hybrid modeling.
Measurement • Measurement is the most fundamental technique and is needed even in analysis and simulation to calibrate the models. • Some measurements are best done in hardware, some in software, and some in a hybrid manner.
Simulation Modeling • Simulation involves constructing a model for the behavior of the system and driving it with an appropriate abstraction of the workload. • The major advantage of simulation is its generality and flexibility; almost any behavior can be easily simulated. • However, there are many important issues that must be considered in simulation: 1. It must be decided what not to simulate and at what level of detail. Simply duplicating the detailed behavior of the system is usually unnecessary and prohibitively expensive. 2. Simulation, like measurement, generates much raw data, which must be analyzed using statistical techniques. 3. As with measurement, a careful experiment design is essential to keep the simulation cost down.
Analytic Modeling Analytic modeling involves constructing a mathematical model of the system behavior (at the desired level of detail) and solving it. The main difficulty here is that the domain of tractable models is rather limited. Thus, analytic modeling will fail if the objective is to study the behavior in great detail. However, for an overall behavior characterization, analytic modeling is an excellent tool.
Advantages of Analytic Modeling over the Other Two Techniques • It generates good insight into the workings of the system, which is valuable even if the model is too difficult to solve. • Simple analytic models can usually be solved easily, yet provide surprisingly accurate results. • Results from analysis have better predictive value than those obtained from measurement or simulation.
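As a minimal sketch of what a simple analytic model can look like, consider a single-server queue with Poisson arrivals and exponentially distributed service times (M/M/1), a model of the kind treated under queuing theory later in the course. The choice of M/M/1, the function name, and the rates are assumptions made for illustration. The formulas are the standard results for this queue: utilization ρ = λ/μ, mean response time 1/(μ − λ), and mean number of jobs in the system ρ/(1 − ρ).

```python
def mm1_metrics(arrival_rate, service_rate):
    """Closed-form steady-state metrics of an M/M/1 queue.

    arrival_rate (lambda) and service_rate (mu) are in jobs per unit time.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")

    utilization = arrival_rate / service_rate
    mean_response_time = 1.0 / (service_rate - arrival_rate)
    mean_jobs_in_system = utilization / (1.0 - utilization)
    return {
        "utilization": utilization,
        "mean_response_time": mean_response_time,
        "mean_jobs_in_system": mean_jobs_in_system,
    }


# Hypothetical rates: 8 jobs/s arriving at a server that completes 10 jobs/s.
metrics = mm1_metrics(arrival_rate=8.0, service_rate=10.0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Evaluating the function over a range of arrival rates shows response time growing sharply as utilization approaches 1, the kind of insight a closed-form model gives almost for free.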
Hybrid Modeling • A complex model may consist of several submodels, each representing a certain aspect of the system. • Only some of these submodels may be analytically tractable; the others must be simulated. • For example, the fraction of memory wasted due to fragmentation may be difficult to estimate analytically, even though other aspects of the system can be modeled analytically.
Hybrid Model (cont) We can take the hybrid approach, which proceeds as follows: 1. Solve the analytic model assuming no fragmentation of memory and determine the distribution of memory holding time. 2. Simulate only memory allocation, holding, and deallocation, and determine the average fraction of memory that could not be used because of fragmentation. 3. Recalibrate the analytic model of step 1 with reduced memory and solve it. (It may be necessary to repeat these steps a few times to get convergence.)
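The iteration described above can be sketched as a fixed-point loop. Both model functions below are hypothetical toy stand-ins, invented only so that the loop runs: a real study would replace them with the actual analytic model and the actual memory-allocation simulation. For simplicity the sketch passes a mean holding time between the two submodels rather than a full distribution.

```python
def solve_analytic_model(usable_memory_fraction):
    """HYPOTHETICAL stand-in for the analytic model.

    Returns a mean memory holding time; the toy assumption is that
    less usable memory means longer holding times.
    """
    return 1.0 / usable_memory_fraction


def simulate_fragmentation(mean_holding_time):
    """HYPOTHETICAL stand-in for the allocation/deallocation simulation.

    Returns the average fraction of memory lost to fragmentation; the toy
    assumption is that waste grows with holding time and saturates at 0.2.
    """
    return 0.2 * mean_holding_time / (1.0 + mean_holding_time)


def hybrid_solve(tolerance=1e-6, max_iterations=100):
    wasted = 0.0  # step 1 starts by assuming no fragmentation
    for iteration in range(1, max_iterations + 1):
        # Steps 1 and 3: solve the analytic model with the current usable memory.
        holding_time = solve_analytic_model(1.0 - wasted)
        # Step 2: simulate memory management only, to estimate the wasted fraction.
        new_wasted = simulate_fragmentation(holding_time)
        if abs(new_wasted - wasted) < tolerance:
            return new_wasted, holding_time, iteration
        wasted = new_wasted
    raise RuntimeError("hybrid iteration did not converge")


wasted, holding_time, iterations = hybrid_solve()
print(f"wasted fraction ~ {wasted:.4f} after {iterations} iterations")
```

Convergence here depends entirely on the toy functions; with real submodels the loop may need damping or a looser tolerance.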
1.3 Applications of Performance Evaluation • System Design • System Selection • System Upgrade • System Tuning • System Analysis
System Design In designing a new system, one typically starts out with certain performance/reliability objectives and a basic system architecture, and then decides how to choose various parameters to achieve the objectives. This involves constructing a model of the system behavior at the appropriate level of detail, and evaluating it to choose the parameters. An analytic model is adequate if we only want to eliminate bad choices; simulation is preferable if we need more detail.
System Selection • Here the problem is to select the “best” system from among a group of systems that are under consideration for reasons of cost, availability, compatibility, etc. • Although direct measurement is the ideal technique to use here, there might be practical difficulties in doing so (e.g., not being able to use the systems under realistic workloads, or not having them available locally). • Therefore, it may be necessary to make projections based on available data and some simple modeling.
System Upgrade • This involves replacing either the entire system or parts thereof with a newer but compatible unit. • Compatibility and cost considerations may dictate the vendor, so the only remaining problem is to choose quantity, speed, and the like. • Often, analytic modeling is adequate here; however, in large systems involving complex interactions between subsystems, simulation modeling may be essential.
System Tuning • The purpose of tuning is to optimize performance by appropriately changing the various resource management policies. • Some examples are the process scheduling mechanism, context switching, buffer allocation schemes, cluster size for paging, and contiguity in file space allocation. • It is necessary to decide which parameters to consider changing and how to change them to get the maximum potential benefit. • Direct experimentation is the simplest technique to use here, but may not be feasible in a production environment. • Tuning often changes aspects that an analytic model cannot easily represent, so simulation is usually the better choice.
System Analysis • Suppose that we find a system to be unacceptably sluggish. • The reason could be either inadequate hardware resources (CPU, memory, disk, etc.) or poor system management. • In the former case, we need a system upgrade, and in the latter, a system tune-up. • Either way, the first task is to determine which of the two cases applies. • This involves monitoring the system and examining the behavior of various resource management policies under different loading conditions. • Experimentation coupled with simple analytic reasoning is usually adequate to identify the trouble spots; however, in some cases, complex interactions may make a simulation study essential.
Performance Evaluation Metrics • Performance metrics can be categorised into three classes based on their utility function: • Higher is Better (HB), e.g., throughput • Lower is Better (LB), e.g., response time • Nominal is Best (NB), e.g., utilisation
Outline • Objectives • What kind of problems will you be able to solve after taking this course? • The Art • Common Mistakes • Systematic Approach • Case Study
Objectives (1 of 6) • Select appropriate evaluation techniques, performance metrics, and workloads for a system • Techniques: measurement, simulation, analytic modeling • Metrics: criteria to study performance (e.g., response time) • Workloads: requests by users/applications to the system • Example: What performance metrics should you use for the following systems? • a) Two disk drives • b) Two transaction processing systems • c) Two packet retransmission algorithms
Objectives (2 of 6) • Conduct performance measurements correctly • Need two tools • Load generator: a tool to load the system • Monitor: a tool to measure the results • Example: Which type of monitor (software or hardware) would be more suitable for measuring each of the following quantities? • a) Number of instructions executed by a processor • b) Degree of multiprogramming on a timesharing system • c) Response time of packets on a network
Objectives (3 of 6) • Use proper statistical techniques to compare several alternatives • Find the best among a number of alternatives • One run of a workload is often not sufficient • Many non-deterministic computer events affect performance • Comparing the averages of several runs may also not lead to correct results • Especially if the variance is high • Example: Packets lost on a link. Which link is better?

File Size   Link A   Link B
1000        5        10
1200        7        3
1300        3        0
50          0        1
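One common way to approach the link example is a paired-observation confidence interval on the per-file difference in packets lost. The sketch below is an illustration, not necessarily the method used later in the course. It assumes a 95% confidence level and a hard-coded Student's t quantile for 3 degrees of freedom taken from a standard table; the variable names are hypothetical.

```python
import math
import statistics

# Packets lost per file, from the table above.
file_sizes = [1000, 1200, 1300, 50]
lost_a = [5, 7, 3, 0]
lost_b = [10, 3, 0, 1]

# Paired differences (A minus B), one per file.
differences = [a - b for a, b in zip(lost_a, lost_b)]
n = len(differences)

mean_diff = statistics.mean(differences)
std_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
std_error = std_diff / math.sqrt(n)

# ASSUMPTION: 0.975 quantile of Student's t with n - 1 = 3 degrees of
# freedom, hard-coded from a standard t table.
T_975_DF3 = 3.182

ci_low = mean_diff - T_975_DF3 * std_error
ci_high = mean_diff + T_975_DF3 * std_error

print(f"mean lost: A = {statistics.mean(lost_a)}, B = {statistics.mean(lost_b)}")
print(f"mean difference (A - B) = {mean_diff:.2f}")
print(f"95% confidence interval = ({ci_low:.2f}, {ci_high:.2f})")
if ci_low <= 0.0 <= ci_high:
    print("Interval includes zero: the data do not show that either link is better.")
else:
    print("Interval excludes zero: the links differ at the 95% level.")
```

The averages alone (3.75 for Link A, 3.5 for Link B) suggest that B is slightly better, but the differences vary so much from file to file that four observations cannot separate the two links.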