Chapter 13

Chapter 13 Queueing Models

Introduction • We all spend a great deal of time waiting in lines (queues). • People are not the only entities that wait in queues. • Manufacturing facilities, computer networks, etc., all have queues. • Mathematically, it does not really matter whether the entities waiting are people or televisions or computer messages. • The same type of analysis applies to all of these.

Introduction continued • The purpose of such an analysis is generally twofold. • The first objective is to examine an existing system to quantify its operating characteristics. • The second objective is to learn how to make a system better. • The first objective, analyzing the characteristics of a given system, is difficult from a mathematical point of view.

Introduction continued • The two basic modeling approaches are analytical and simulation. • The analytical approach searches for mathematical formulas that describe the operating characteristics of the system, usually in “steady state.” • The mathematical models are typically too complex to solve unless simplifying (and sometimes unrealistic) assumptions are made. • With the second approach, simulation, much more complex systems can be analyzed without making many simplifying assumptions. • However, the drawback to queueing simulation is that it usually requires specialized software packages or trained computer programmers to implement.

Introduction continued • In this chapter, we employ both the analytical approach and simulation. • For the former, we discuss several well-known queueing models that describe some—but certainly not all—queueing situations in the real world. • These analytical models generally require simplifying assumptions, and even then they can be difficult to understand. • The second objective in many queueing studies is optimization, where the goal is to find the “best” system.

Introduction continued • This chapter is very different from earlier chapters because of the nature of queueing systems. • The models in previous chapters could almost always be developed from “first principles.” • By using relatively simple formulas involving functions such as SUM, SUMPRODUCT, IF, and so on, it was fairly straightforward to convert inputs into outputs. • This is no longer possible with queueing models. • The inputs are typically mean customer arrival rates and mean service times. • The required outputs are typically mean waiting times in queues, mean queue lengths, the fraction of time servers are busy, and possibly others.

Introduction continued • Deriving the formulas that relate the inputs to the outputs is mathematically very difficult, well beyond the level of this book. • Therefore, many times in this chapter you will have to take our word for it. • Nevertheless, the models we illustrate are very valuable for the important insights they provide.

Elements of queueing models • Almost all queueing systems are alike in that customers enter a system, possibly wait in one or more queues, get served, and then depart. • This general description of a queueing system - customers entering, waiting in line, and being served - hardly suggests the variety of queueing systems that exist.

Characteristics of arrivals • First, the customer arrival process must be specified. • This includes the timing of arrivals as well as the types of arrivals. • Regarding timing, specifying the probability distribution of interarrival times, the times between successive customer arrivals, is most common. • These interarrival times might be known—that is, nonrandom. • Much more commonly, however, interarrival times are random with a probability distribution.

Characteristics of arrivals continued • Regarding the types of arrivals, there are at least two issues. • First, customers can arrive one at a time or in batches. • Another issue is whether (or how long) customers will wait in line. • A customer might arrive to the system, see that too many customers are waiting in line, and decide not to enter the system at all. This is called balking. • A variation of balking occurs when the choice is made by the system, not the customer. We call this a limited waiting room system. • Another type of behavior, called reneging, occurs when a customer already in line becomes impatient and leaves the system before starting service.

Service discipline • When customers enter the system, they might have to wait in line until a server becomes available. In this case, the service discipline must be specified. • The service discipline is the rule that states which customer, from all who are waiting, goes into service next. • The most common service discipline is first-come-first-served (FCFS), where customers are served in the order of their arrival.

Service discipline continued • However, other service disciplines are possible, including service-in-random-order (SRO), last-come-first-served (LCFS), and various priority disciplines (if there are customer classes with different priorities). • For example, a type of priority discipline used in some manufacturing plants is called the shortest-processing-time (SPT) discipline. • One other aspect of the waiting process is whether there is a single line or multiple lines. • For example, most banks now have a single line. An arriving customer joins the end of the line. • In contrast, most supermarkets have multiple lines.

Service characteristics • In the simplest systems, each customer is served by exactly one server, even when the system contains multiple servers. • The service times typically vary in some random manner, although constant (nonrandom) service times are sometimes possible. • When service times are random, the probability distribution of a typical service time must be specified. • This probability distribution can be the same for all customers and servers, or it can depend on the server and/or the customer.

Service characteristics continued • In a situation like the typical bank, where customers join a single line and are then served by the first available teller, the servers (tellers) are said to be in parallel.

Service characteristics continued • A different type of service process is found in many manufacturing settings. • For example, various types of parts (the “customers”) enter a system with several types of machines (the “servers”). Each part type then follows a certain machine routing, such as machine 1, then machine 4, and then machine 2. • Each machine has its own service time distribution, and a typical part might have to wait in line behind any or all of the machines on its routing. • This type of system is called a queueing network.

Service characteristics continued • The simplest type of queueing network is a series system, where all parts go through the machines in numerical order: first machine 1, then machine 2, then machine 3, and so on (see below).

Short-run vs. steady-state behavior • If you run a fast-food restaurant, you are particularly interested in the queueing behavior during your peak lunchtime period. • The customer arrival rate during this period increases sharply, and you probably employ more workers to meet the increased customer load. • In this case, your primary interest is in the short-run behavior of the system - the next hour or two. • Unfortunately, short-run behavior is the most difficult to analyze, at least with analytical models.

Short-run vs. steady-state behavior continued • But where is the line drawn between the short run and the long run? The answer depends on how long the effects of initial conditions persist. • Analytical models are best suited for studying long-run behavior. This type of analysis is called steady-state analysis and is the focus of much of the chapter. • One requirement for steady-state analysis is that the parameters of the system remain constant for the entire time period. • In particular, the arrival rate must remain constant.

Short-run vs. steady-state behavior continued • Another requirement for steady-state analysis is that the system must be stable. • This means that the servers must serve fast enough to keep up with arrivals - otherwise, the queue can theoretically grow without limit.

The exponential distribution • Queueing systems generally contain uncertainty. • Specifically, times between customer arrivals (interarrival times) and customer service times are generally modeled as random variables. • The most common probability distribution used to model these uncertain quantities is the exponential distribution. • Many queueing models can be analyzed in a fairly straightforward manner, even on a spreadsheet, if exponentially distributed interarrival times and service times are assumed.

The exponential distribution continued • A random variable X has an exponential distribution with parameter λ (with λ > 0) if the density function for X has the form f(x) = λe^(-λx) for all x > 0. • λ is the Greek letter lambda. Its use is standard in the queueing literature. • The mean and standard deviation of this distribution are easy to remember. They are both equal to the reciprocal of the parameter λ.

The exponential distribution continued • The random variable X is always expressed in some time unit, such as minutes. • For example, X might be the number of minutes it takes to serve a customer. • Now, suppose that the mean service time is three minutes. Then 1/λ = 3, so that λ = 1/3. • For this reason, λ can be interpreted as a rate: in this case, one customer every three minutes (on average).

The memoryless property • The property that makes the exponential distribution so useful in queueing models (and in many other management science models) is called the memoryless property, which can be stated as follows: P(X > x + h | X > x) = P(X > h) for all x, h > 0. • The probability on the left is a conditional probability, the probability that X is greater than x + h, given that it is greater than x.

The memoryless property • The exponential distribution is attractive from a mathematical point of view. • If a process is observed at any time, all exponential times (interarrival times and service times, say) essentially “start over” probabilistically—you do not have to know how long it has been since various events (the last arrival or the beginning of service) occurred. • The exponential distribution is the only continuous probability distribution with this property. • On the negative side, however, this strong memoryless property makes the exponential distribution inappropriate for many real applications.
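The memoryless property, P(X > x + h | X > x) = P(X > h), can be checked numerically from the exponential survival function P(X > t) = e^(-λt). The following is a minimal Python sketch; the function names and the values of x and h are chosen here purely for illustration.

```python
import math

def survival(rate, t):
    """P(X > t) for an exponential random variable with parameter `rate`."""
    return math.exp(-rate * t)

def conditional_survival(rate, x, h):
    """P(X > x + h | X > x), from the definition of conditional probability."""
    return survival(rate, x + h) / survival(rate, x)

rate = 1 / 3      # mean of 3 minutes, as in the service-time illustration
x, h = 5.0, 2.0   # illustrative values (assumptions, not from the slides)

lhs = conditional_survival(rate, x, h)  # probability given 5 minutes already elapsed
rhs = survival(rate, h)                 # probability for a "fresh" start
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, whatever x is used, which is what "starting over" means probabilistically.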

Example 13.1:Estimating interarrival and service times • A bank manager would like to use an analytical queueing model to study the congestion at the bank’s automatic teller machines (ATMs). • A simple model of this system requires that the interarrival times and service times are exponentially distributed. • During a period of time when business is fairly steady, several employees use stopwatches to gather data on interarrival times and service times.

Example 13.1 continued:Background information • These are listed in the table on the next slide. • The bank manager wants to know, based on these data, whether it is reasonable to assume exponentially distributed interarrival times and service times. • In each case he also wants to know the appropriate value of λ.

Example 13.1 continued:Interarrival and service times for the ATM

Example 13.1 continued:Exponential Fit.xlsx • To see whether these times are consistent with the exponential distribution, we plot histograms of the interarrival times and the service times. • The histograms appear on the next slide. • This file contains the model.

Example 13.1 continued:Histograms

Example 13.1 continued:Solution • The histogram of the interarrival times appears to be quite consistent with the exponential density shown. • Its highest bar is at the left, and the remaining bars fall off gradually from left to right. • On the other hand, the histogram of the service times is not shaped like the exponential density. • Its highest bar is not at the far left, where an exponential density would peak.

Example 13.1 continued:Solution • There is some minimum time required to process any customer, regardless of the task, so that the most likely times are not close to 0. • Therefore, the exponential assumption for interarrival times is reasonable, but it is questionable for service times. • In either case, if the manager decides to accept the exponential assumption, the parameter λ is the rate of arrivals (or services) and is estimated by the reciprocal of the average of the observed times.

Example 13.1 continued:Solution • For interarrival times, this estimate of λ is the reciprocal of the average in cell C39: 1/25.3 = 0.0395 per second; that is, 1 arrival every 25.3 seconds. • For service times, the estimated λ is the reciprocal of the average in cell G39: 1/22.3 = 0.0448 per second; that is, 1 service every 22.3 seconds.
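The estimation step, taking the reciprocal of the average observed time, can be sketched in a few lines of Python. The function name estimate_rate and the sample_times list are hypothetical; the bank's actual observations are in the workbook and are not reproduced here. Only the two averages (25.3 and 22.3 seconds) come from the example.

```python
from statistics import mean

def estimate_rate(observed_times):
    """Estimate the exponential parameter as the reciprocal of the sample mean."""
    return 1 / mean(observed_times)

# Hypothetical observations in seconds (NOT the bank's data).
sample_times = [12.0, 40.0, 5.0, 31.0, 22.0]
print(estimate_rate(sample_times))  # events per second for the made-up sample

# Using the averages reported in the example (seconds):
arrival_rate = 1 / 25.3   # arrivals per second
service_rate = 1 / 22.3   # services per second
print(round(arrival_rate, 4), round(service_rate, 4))
```

The estimated rate is always in "events per time unit" for whatever unit the observations use, so seconds in the data give a per-second rate.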

The Poisson process model • When the interarrival times are exponentially distributed, we often state that “arrivals occur according to a Poisson process.” • There is a close relationship between the exponential distribution, which measures times between events such as arrivals, and the Poisson distribution, which counts the number of events in a certain length of time. • If customers arrive at a bank according to a Poisson process with rate one every three minutes, this implies that the interarrival times are exponentially distributed with parameter λ= 1/3.
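The link between exponential interarrival times and Poisson counts can be illustrated with a small simulation. This is a sketch under assumptions: the window length (30 minutes), the horizon, the seed, and the function names are all chosen here for illustration. It generates arrivals with exponential gaps at rate 1/3 per minute and checks that the average count per window is close to rate × window length.

```python
import random

def simulate_arrival_times(rate, horizon, rng):
    """Arrival times on [0, horizon) built from exponential interarrival times."""
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def counts_per_window(arrival_times, window, horizon):
    """Number of arrivals in each consecutive window of equal length."""
    n_windows = int(horizon // window)
    counts = [0] * n_windows
    for t in arrival_times:
        idx = int(t // window)
        if idx < n_windows:
            counts[idx] += 1
    return counts

rng = random.Random(0)   # fixed seed so the run is repeatable
rate = 1 / 3             # one arrival every three minutes on average
window = 30.0            # minutes per counting window (illustrative)
horizon = 30000.0        # total simulated minutes (illustrative)

arrivals = simulate_arrival_times(rate, horizon, rng)
counts = counts_per_window(arrivals, window, horizon)
mean_count = sum(counts) / len(counts)
print(mean_count, rate * window)  # simulated average vs. theoretical mean
```

With a Poisson process, the count in a 30-minute window has mean 10, so the simulated average should land near that value.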

Important queueing relationships • There are several very useful and general relationships that hold for a wide variety of queueing models. • We briefly discuss them here so that they can be used in the queueing models in later sections. • We typically calculate two general types of outputs in a queueing model: time averages and customer averages.

Typical time averages • Typical time averages are: • L, the expected number of customers in the system • LQ, the expected number of customers in the queue • LS, the expected number of customers in service • P(all idle), the probability that all servers are idle • P(all busy), the probability that all servers are busy • If you were going to estimate the quantity LQ, for example, you might observe the system at many time points, record the number of customers in the queue at each time point, and then average these numbers. In other words, you would average this measure over time.

Typical customer averages • Typical customer averages are: • W, the expected time spent in the system (waiting in line or being served) • WQ, the expected time spent in the queue • WS, the expected time spent in service • To estimate the quantity WQ, for example, you would observe many customers, record the time in queue for each customer, and then average these times over the number of customers observed. Now you are averaging over customers.

Little’s formula • Little’s formula is a famous formula that relates time averages and customer averages in steady state. This formula was first discovered by John D.C. Little. • Consider any queueing system. Let λ be the average rate at which customers enter this system, let L be the expected number of customers in the system, and let W be the expected time a typical customer spends in the system. Then Little’s formula can be expressed as L=λW

Other relationships • Two other formulas relate these quantities. First, all customers are either in service or in the queue, which leads to the following equation: L = LQ + LS • A second useful formula is the following: W = WQ + WS • One final important queueing measure is called the server utilization. The server utilization, denoted by U, is defined as the long-run fraction of time a typical server is busy. • In a multiple-server system, where there are s identical servers in parallel, server utilization is defined as U = LS/s
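These relationships can be tied together in a short Python sketch. It assumes that Little's formula is also applied separately to the queue and to the service facility (LQ = λWQ and LS = λWS), which is a standard use of the formula. The function name and the input values below are illustrative assumptions.

```python
def little_summary(arrival_rate, wait_in_queue, service_time, servers=1):
    """Derive time averages from customer averages using Little's formula."""
    W = wait_in_queue + service_time      # W = WQ + WS
    L = arrival_rate * W                  # L = lambda * W
    LQ = arrival_rate * wait_in_queue     # Little's formula applied to the queue
    LS = arrival_rate * service_time      # Little's formula applied to service
    U = LS / servers                      # utilization with s identical servers
    return {"W": W, "L": L, "LQ": LQ, "LS": LS, "U": U}

# Illustrative inputs: 0.5 customers per minute, 4.5 minutes in queue,
# 1.5 minutes in service, one server.
summary = little_summary(arrival_rate=0.5, wait_in_queue=4.5, service_time=1.5)
print(summary)
```

Because every quantity is derived from the same arrival rate, L = LQ + LS holds automatically in the output.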

Analytical steady-state queueing models • In this section, we discuss several analytical models for queueing systems. • As stated earlier, these models cannot be developed without a fair amount of mathematical background - more than is assumed in this book. • Therefore, we must rely on the queueing models that have been developed in the management science literature.

The basic single-server model • We begin by discussing the most basic single-server model, labeled the M/M/1 model. • This shorthand notation, developed by Kendall, implies three things. • The first M implies that the distribution of interarrival times is exponential. • The second M implies that the distribution of service times is also exponential. • Finally, the “1” implies that there is a single server. • Customarily, λ denotes the arrival rate, and μ denotes the service rate. • Example 13.2 illustrates this model.

Example 13.2:Background information • The Smalltown postal branch employs a single clerk. • Customers arrive at this postal branch according to a Poisson process at a rate of 30 customers per hour, and service times are exponentially distributed with a mean of 1.5 minutes. • All arriving customers enter the branch, regardless of the number already waiting in line.

Example 13.2 continued:Background information • The manager of the postal branch would ultimately like to decide whether to improve the system. • To do this, she first needs to develop a queueing model that describes the steady-state characteristics of the current system.

Example 13.2 continued:Solution • To begin, we must choose a common unit of time and then express the arrival and service rates (λ and μ) in this unit. • We could measure time in seconds, minutes, hours, or any other convenient time unit, as long as we are consistent. • Here we will choose minutes. Then, because 1 customer arrives every 2 minutes, λ = 1/2. Also, because the mean service time is 1.5 minutes, μ = 1/1.5 = 0.667.

Example 13.2 continued:Solution • For the M/M/1 model, it turns out that the server utilization equals λ/μ, the arrival rate divided by the service rate. • To ensure that the system is stable, we must require that the server utilization is less than 1, so that the arrival rate is less than the service rate. • Otherwise, waiting lines will tend to grow indefinitely in the long run. • In general, the formulas for the M/M/1 model are somewhat complex. Therefore, we have implemented them in an M/M/1 “template” file.

Example 13.2 continued:MM1 Template.xlsx • This file contains the model. • The template is shown here.

Example 13.2 continued:Solution • We will not provide step-by-step instructions because we expect that you will use this as a template rather than enter the formulas yourself. • However, we make the following points. • All you need to enter are the inputs in B4 through B6. • You can enter numbers for the rates in cells B5 and B6, or you can base these on observed data.

Example 13.2 continued:Solution • The value of L in cell B15 is calculated from equation. Then the values in cells B5, B15, and B17 are related by the equation version of Little’s formula. • The steady-state probabilities in column F are based on equation. You can copy these down as far as you like, until the probabilities are negligible. • The waiting time probability in cell I11 is calculated from equation (14.11). You can enter any time t in cell H11 to obtain the probability that a typical customer will wait in the queue at least this amount of time. Alternatively, you can enter other values of t in cells H12, H13, and so on, and then copy the formula in cell I11 down to calculate other waiting time probabilities.

Example 13.2 continued:Solution • From the template we see, for example, that when the arrival rate is 0.5 and the service rate is 0.667, the expected number of customers in the queue is 2.25 and the expected time a typical customer spends in the queue is 4.5 minutes. • However, 25% of all customers spend no time in the queue, while 53.7% spend more than 2 minutes in the queue. • Also, we see that the steady-state probability of having exactly 4 customers in the system is 0.079. Equivalently, there are exactly 4 customers in the system 7.9% of the time.
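The figures quoted from the template can be reproduced with the standard textbook M/M/1 steady-state formulas. The sketch below is not the template's own implementation (its cell formulas are not shown here); the function names are hypothetical, and the formulas used are the usual ones with ρ = λ/μ: L = ρ/(1 - ρ), LQ = ρ²/(1 - ρ), W = 1/(μ - λ), WQ = ρ/(μ - λ), P(n in system) = (1 - ρ)ρ^n, and P(wait in queue > t) = ρe^(-(μ - λ)t).

```python
import math

def mm1_metrics(arrival_rate, service_rate):
    """Steady-state M/M/1 measures; requires arrival_rate < service_rate."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate     # server utilization
    gap = service_rate - arrival_rate
    return {
        "utilization": rho,
        "L": rho / (1 - rho),             # expected number in system
        "LQ": rho ** 2 / (1 - rho),       # expected number in queue
        "W": 1 / gap,                     # expected time in system
        "WQ": rho / gap,                  # expected time in queue
    }

def prob_n_in_system(arrival_rate, service_rate, n):
    """Steady-state probability of exactly n customers in the system."""
    rho = arrival_rate / service_rate
    return (1 - rho) * rho ** n

def prob_wait_exceeds(arrival_rate, service_rate, t):
    """Probability that a customer waits in the queue longer than t (t >= 0)."""
    rho = arrival_rate / service_rate
    return rho * math.exp(-(service_rate - arrival_rate) * t)

lam, mu = 0.5, 1 / 1.5   # per-minute rates from Example 13.2
m = mm1_metrics(lam, mu)
print(m["LQ"], m["WQ"])                          # about 2.25 and 4.5
print(1 - prob_wait_exceeds(lam, mu, 0))         # about 0.25 (no wait)
print(round(prob_wait_exceeds(lam, mu, 2), 3))   # 0.537
print(round(prob_n_in_system(lam, mu, 4), 3))    # 0.079
```

Raising an error when the arrival rate is not below the service rate mirrors the stability requirement: the steady-state formulas are meaningless otherwise.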

Example 13.2 continued:Solution • The postal branch manager can experiment with other arrival rates or service rates in cells B5 and B6 to see how the various output measures are affected. • One particularly important insight can be obtained through a data table shown on the next slide. • The current server utilization is 0.75, and the system is behaving fairly well, with short waits in queue on average. • The data table, however, shows how bad things can get when the service rate is just barely above the arrival rate, so that the server utilization is just barely below 1.

Example 13.2 continued:Effect of varying service rate
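The effect of a service rate that is only slightly above the arrival rate can be sketched with the standard M/M/1 result WQ = λ/(μ(μ - λ)). The service rates listed below are illustrative choices, not the values in the data table, and the function name is hypothetical.

```python
def expected_wait_in_queue(arrival_rate, service_rate):
    """Standard M/M/1 expected time in queue: lambda / (mu * (mu - lambda))."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    return arrival_rate / (service_rate * (service_rate - arrival_rate))

arrival_rate = 0.5  # customers per minute, as in Example 13.2
# Illustrative service rates approaching the arrival rate from above.
service_rates = [0.667, 0.60, 0.55, 0.52, 0.51, 0.505]

rows = [
    (mu, arrival_rate / mu, expected_wait_in_queue(arrival_rate, mu))
    for mu in service_rates
]
print("service rate  utilization  WQ (minutes)")
for mu, utilization, wq in rows:
    print(f"{mu:12.3f}  {utilization:11.3f}  {wq:12.2f}")
```

The expected wait grows slowly at first and then very sharply as utilization approaches 1, which is the insight the data table is meant to convey.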
