Probability Essentials

Probability Essentials • The concept of probability is quite intuitive; however, the rules of probability are not always intuitive or easy to master. • Mathematically, a probability is a number between 0 and 1 that measures the likelihood that some event will occur. • An event with probability 0 cannot occur. • An event with probability 1 is certain to occur. • An event with probability greater than 0 and less than 1 involves uncertainty, but the closer its probability is to 1, the more likely it is to occur.

Rule of Complement • The simplest probability rule involves the complement of an event. • If A is any event, then the complement of A, denoted by Ac, is the event that A does not occur. • If the probability of A is P(A), then the probability of its complement, P(Ac), is P(Ac) = 1 - P(A). • Equivalently, the probability of an event and the probability of its complement sum to 1.

Addition Rule • We say that events are mutually exclusive if at most one of them can occur. That is, if one of them occurs, then none of the others can occur. • Events can also be exhaustive, which means that they exhaust all possibilities: at least one of the events must occur. • Let A1 through An be any n mutually exclusive events. Then the addition rule of probability gives the probability that at least one of these events will occur: P(at least one of A1 through An) = P(A1) + P(A2) + ... + P(An)

Conditional Probability • Probabilities are always assessed relative to the information currently available. As new information becomes available, probabilities often change. • A formal way to revise probabilities on the basis of new information is to use conditional probabilities. • Let A and B be any events with probabilities P(A) and P(B). Typically the probability P(A) is assessed without knowledge of whether B does or does not occur. However if we are told B has occurred, the probability of A might change.

Conditional Probability -- continued • The new probability of A is called the conditional probability of A given B. It is denoted P(A|B). • Note that there is uncertainty involving the event to the left of the vertical bar in this notation; we do not know whether it will occur or not. However, there is no uncertainty involving the event to the right of the vertical bar; we know that it has occurred. • The following conditional probability formula enables us to calculate P(A|B): P(A|B) = P(A and B) / P(B)

Multiplication Rule • In the conditional probability formula the numerator is the probability that both A and B occur. It must be known in order to determine P(A|B). • However, in some applications P(A|B) and P(B) are known; in these cases we can multiply both sides of the conditional probability formula by P(B) to obtain the multiplication rule: P(A and B) = P(A|B)P(B) • The conditional probability formula and the multiplication rule are both valid; in fact, they are equivalent.

Assessing the Bendrix Situation • Now that we are familiar with a number of probability rules, we can put them to work in assessing the Bendrix situation. • To begin, we let A be the event that Bendrix meets its end-of-July deadline, and let B be the event that Bendrix receives the materials from its supplier by the middle of July. • The probabilities that we are best able to assess on July 1 are probably P(B) and P(A|B).

Assessing -- continued • They estimate a 2 in 3 chance of getting the materials on time; thus P(B) = 2/3. • They also estimate that if they receive the materials on time, then the chances of meeting the deadline are 3 out of 4. This is a conditional probability statement that P(A|B) = 3/4. • We can use the multiplication rule to obtain: P(A and B) = P(A|B)P(B) = (3/4)(2/3) = 0.5 • There is a 50-50 chance that Bendrix will get its materials on time and meet its deadline.

Assessing -- continued • Other probabilities of interest exist in this example. • Let Bc be the complement of B; it is the event that the materials from the supplier do not arrive on time. We know that P(Bc) = 1 - P(B) = 1/3 from the rule of complements. • Bendrix estimates that the chances of meeting the deadline are 1 out of 5 if the materials do not arrive on time, that is, P(A|Bc) = 1/5. The multiplication rule gives P(A and Bc) = P(A|Bc)P(Bc) = (1/5)(1/3) = 1/15 ≈ 0.0667

Assessing -- continued • In words, there is a 1 in 15 chance that the materials will not arrive on time and Bendrix will still meet its deadline. • The bottom line for Bendrix is whether it will meet its end-of-July deadline. After the middle of July the probability is either 3/4 or 1/5, because by this time they will know whether the materials have arrived on time. • But since it is July 1, the relevant probability is P(A); there is still uncertainty about whether B or Bc will occur.

Assessing -- continued • We can calculate P(A) from the probabilities we already know. The events "A and B" and "A and Bc" are mutually exclusive, and exactly one of them occurs whenever A occurs, so the addition rule for mutually exclusive events gives P(A) = P(A and B) + P(A and Bc) = (1/2) + (1/15) = 17/30 ≈ 0.5667 • In words, the chances are 17 out of 30 that Bendrix will meet its end-of-July deadline, given the information it has at the beginning of July.
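The Bendrix calculation can be retraced in a few lines of Python. This is a minimal sketch that is not part of the original slides; the variable names are illustrative, and exact fractions are used so that 1/15 and 17/30 appear without rounding.

```python
from fractions import Fraction

# Probabilities assessed by Bendrix on July 1 (taken from the slides).
p_b = Fraction(2, 3)            # P(B): materials arrive by mid-July
p_a_given_b = Fraction(3, 4)    # P(A|B): deadline met, given on-time materials
p_a_given_bc = Fraction(1, 5)   # P(A|Bc): deadline met, given late materials

p_bc = 1 - p_b                      # rule of complements
p_a_and_b = p_a_given_b * p_b       # multiplication rule
p_a_and_bc = p_a_given_bc * p_bc    # multiplication rule
p_a = p_a_and_b + p_a_and_bc        # addition rule, mutually exclusive events

print(p_bc, p_a_and_b, p_a_and_bc, p_a)   # prints: 1/3 1/2 1/15 17/30
```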

Probabilistic Independence • A concept that is closely tied to conditional probability is probabilistic independence. • There are situations, unlike the Bendrix example, in which P(A), P(A|B), and P(A|Bc) are not all different; instead, these probabilities are all equal. In this case we say that events A and B are independent. • This does not mean they are mutually exclusive; it means that knowledge of one of the events is of no value when assessing the probability of the other event.

Probabilistic Independence -- continued • The main advantage of knowing that two events are independent is that the multiplication rule simplifies to P(A and B) = P(A)P(B) • In order to determine if events are probabilistically independent we usually cannot use mathematical arguments; we must use empirical data to decide whether independence is reasonable.
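When the probabilities are known exactly, the simplified multiplication rule doubles as a test for independence. The sketch below assumes exact probabilities are available, which, as noted above, is rarely the case in practice; the helper name `are_independent` and the coin-flip contrast are illustrative and do not come from the slides.

```python
from fractions import Fraction

def are_independent(p_a, p_b, p_a_and_b):
    """Return True when P(A and B) equals P(A)P(B) exactly."""
    return p_a_and_b == p_a * p_b

# Bendrix values from the slides: P(A) = 17/30, P(B) = 2/3, P(A and B) = 1/2.
bendrix = are_independent(Fraction(17, 30), Fraction(2, 3), Fraction(1, 2))

# Hypothetical contrast (not in the slides): two fair coin flips, where
# A = "first flip is heads" and B = "second flip is heads".
coins = are_independent(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))

print(bendrix, coins)   # prints: False True
```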

Distribution of a Single Random Variable

Background Information • An investor is concerned with the market return for the coming year, where the market return is defined as the percentage gain (or loss, if negative) over the year. • The investor believes there are five possible scenarios for the national economy in the coming year: rapid expansion, moderate expansion, no growth, moderate contraction, or serious contraction. • She estimates that the market returns for these scenarios are, respectively, 0.23, 0.18, 0.15, 0.09, and 0.03.

Background Information -- continued • Also, she has assessed that the probabilities of these outcomes are 0.12, 0.40, 0.25, 0.15, and 0.08. • We must use this information to describe the probability distribution of the market return.

Types of Random Variables • A discrete random variable has only a finite number of possible values. • A continuous random variable has a continuum of possible values. • Mathematically, there is an important difference between discrete and continuous random variables. A proper treatment of continuous variables requires calculus. In this book we will only be dealing with discrete random variables.

Discrete Random Variables • The properties of discrete random variables and their associated probability distributions are as follows: • Let X be a random variable. To specify the probability distribution of X we need to specify its possible values and their probabilities. These probabilities must sum to 1. • It is sometimes useful to calculate cumulative probabilities. A cumulative probability is the probability that the random variable is less than or equal to some particular value.
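As a small illustration of cumulative probabilities, the sketch below uses the investor's market returns and probabilities from the background slides. The code itself is an addition, not something the slides provide; it sorts the possible values and accumulates their probabilities.

```python
from itertools import accumulate

# Market returns and their probabilities, as listed in the background slides.
returns = [0.23, 0.18, 0.15, 0.09, 0.03]
probs = [0.12, 0.40, 0.25, 0.15, 0.08]

# Sort by return so that each running total is P(X <= value).
pairs = sorted(zip(returns, probs))
cumulative = list(accumulate(p for _, p in pairs))

for (value, _), cum in zip(pairs, cumulative):
    print(f"P(X <= {value:.2f}) = {cum:.2f}")
```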

Summarizing a Probability Distribution • A probability distribution can be summarized with two or three well-chosen numbers: • The mean, often called the expected value, is a weighted sum of the possible values, with the probabilities as weights. It indicates the center of the probability distribution. • To measure the variability in a distribution, we calculate its variance or standard deviation. The variance is a weighted sum of the squared deviations of the possible values from the mean. As in the previous chapter, the variance is expressed in the squared units of X, so a more natural measure of variability is the standard deviation, the square root of the variance.

MRETURN.XLS • This file contains the values and probabilities estimated by the investor in this example. • Mean, Probs, Returns, Var and Sqdevs have been specified as range names.

Calculating Summary Measures • The summary measures for the probability distribution of the outcomes can be calculated as follows: • Mean return: =SUMPRODUCT(Returns,Probs) • Squared Deviations: =(C4-Mean)^2 • Variance: =SUMPRODUCT(SqDevs,Probs) • Standard Deviation: =SQRT(Var) • We see that the mean return is 15.3% and the standard deviation is 5.3%. What do these mean?
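The spreadsheet formulas above translate directly into Python. The following is a rough equivalent rather than the workbook itself; the helper `sumproduct` is an illustrative stand-in for Excel's SUMPRODUCT, and the data are the returns and probabilities from the background slides.

```python
import math

returns = [0.23, 0.18, 0.15, 0.09, 0.03]
probs = [0.12, 0.40, 0.25, 0.15, 0.08]

def sumproduct(xs, ws):
    """Sum of pairwise products, like =SUMPRODUCT(xs, ws)."""
    return sum(x * w for x, w in zip(xs, ws))

mean = sumproduct(returns, probs)               # =SUMPRODUCT(Returns,Probs)
sq_devs = [(r - mean) ** 2 for r in returns]    # =(C4-Mean)^2, one per row
variance = sumproduct(sq_devs, probs)           # =SUMPRODUCT(SqDevs,Probs)
stdev = math.sqrt(variance)                     # =SQRT(Var)

print(f"mean = {mean:.3f}, variance = {variance:.6f}, stdev = {stdev:.3f}")
# prints: mean = 0.153, variance = 0.002811, stdev = 0.053
```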

Analyzing the Summary Measures • First, the mean or expected return does not imply that the most likely return is 15.3%, nor is this the value that the investor “expects” to occur. The value 15.3% is not even a possible market return. • We can understand these measures better in terms of long-run averages. • If we could see the coming year repeated many times, using the same probability distribution, then the average of the resulting returns would be close to 15.3% and their standard deviation would be close to 5.3%.
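The long-run interpretation can be checked with a quick simulation. This sketch is an addition to the slides: it draws 100,000 hypothetical "repeated years" from the investor's distribution with Python's `random.choices`, using an arbitrary seed of 42 so that the run is reproducible.

```python
import random
import statistics

returns = [0.23, 0.18, 0.15, 0.09, 0.03]
probs = [0.12, 0.40, 0.25, 0.15, 0.08]

rng = random.Random(42)   # arbitrary seed, for reproducibility only
simulated = rng.choices(returns, weights=probs, k=100_000)

average = statistics.fmean(simulated)   # long-run average return
spread = statistics.pstdev(simulated)   # standard deviation of simulated returns

print(f"average = {average:.4f}, stdev = {spread:.4f}")
```

With this many draws the two printed values should land close to 0.153 and 0.053, although no single simulated year can equal 0.153.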

Derived Probability Distributions

Background Information • A bookstore is planning to order a shipment of special edition Christmas calendars that it will sell for $15 apiece. • There will be only one order, so: • if demand is less than the quantity ordered, the excess calendars will be donated to a paper recycling company; • if demand is greater than the quantity ordered, the excess demand will be lost and customers will take their business elsewhere. • The bookstore estimates that the demand for calendars will be between 250 and 400.

DERIVED.XLS • This file contains the probability distribution that the demand for calendars will follow. These estimates have been derived from subjective estimates and historical data. • If the bookstore decides to order 350 calendars, what is the probability distribution of units sold? What is the probability distribution of revenue?

Derived Distributions of Units Sold and Revenue

Solution • Let D, S, and R denote demand, units sold, and revenue. • The key to the solution is that each value of D directly determines the value of S, which in turn determines the value of R. • S is the smaller of D and the number ordered, 350, and R is $15 multiplied by the value of S. • Therefore we can derive the probability distributions of S and R with the following steps:

Solution -- continued • Calculate units sold for each value of demand: =MIN(B10,OnHand) • Calculate revenue for each value of units sold: =UnitPrice*B20 • Transfer the derived probabilities from demand: =C10 • Calculate the means of demand, units sold, and revenue; for revenue, for example: =SUMPRODUCT(Revenues,DerivedProbs) • Calculate the variances and standard deviations of demand, units sold, and revenue. • For revenue, first calculate the squared deviations of revenues from their mean in Column F, then calculate the sum of the products of these squared deviations and the revenue probabilities to obtain the variance of revenue. Finally, calculate the standard deviation as the square root of the variance.
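The steps above can be sketched in Python as follows. The demand probabilities in DERIVED.XLS are not reproduced in these slides, so the `demand_probs` values below are HYPOTHETICAL placeholders chosen only to make the sketch runnable; the price ($15) and the order quantity (350) come from the example.

```python
import math

UNIT_PRICE = 15        # dollars per calendar (from the slides)
ORDER_QUANTITY = 350   # calendars ordered (from the slides)

# HYPOTHETICAL demand distribution; the real one lives in DERIVED.XLS.
demand_probs = {250: 0.1, 300: 0.3, 350: 0.4, 400: 0.2}

def summarize(dist):
    """Return (mean, standard deviation) of a {value: probability} dict."""
    mean = sum(v * p for v, p in dist.items())
    variance = sum((v - mean) ** 2 * p for v, p in dist.items())
    return mean, math.sqrt(variance)

# Units sold is the smaller of demand and the order quantity; demands that
# lead to the same units sold have their probabilities added together.
sold_probs = {}
for demand, prob in demand_probs.items():
    sold = min(demand, ORDER_QUANTITY)
    sold_probs[sold] = sold_probs.get(sold, 0.0) + prob

# Revenue is the unit price times units sold, with the same probabilities.
revenue_probs = {UNIT_PRICE * s: p for s, p in sold_probs.items()}

for name, dist in [("demand", demand_probs), ("units sold", sold_probs),
                   ("revenue", revenue_probs)]:
    mean, stdev = summarize(dist)
    print(f"{name}: mean = {mean:.1f}, stdev = {stdev:.1f}")
```

Because units sold is capped at 350, its mean is lower than the mean of demand whenever demand above 350 has positive probability.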

Summary Measures for Linear Functions • When one random variable Y is a linear function of another random variable X, there is a particularly simple way to calculate the summary measures of Y from the summary measures of X. If Y = a + bX for some constants a and b, then: • mean: E(Y) = a + bE(X) • variance: Var(Y) = b^2 Var(X) • standard deviation: Stdev(Y) = |b| Stdev(X) • If Y is a positive constant multiple of X, that is, a = 0 and b > 0, then the mean and standard deviation of Y are this same multiple of the mean and standard deviation of X.
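A quick numerical check of these rules is sketched below. It reuses the market-return distribution from the earlier example as X; the constants `a = -50` and `b = 10_000` (a flat $50 fee on a $10,000 investment) are HYPOTHETICAL and serve only to illustrate the linear transformation.

```python
import math

returns = [0.23, 0.18, 0.15, 0.09, 0.03]   # possible values of X
probs = [0.12, 0.40, 0.25, 0.15, 0.08]

a, b = -50.0, 10_000.0   # hypothetical constants in Y = a + bX

def mean_and_stdev(values, weights):
    mean = sum(v * w for v, w in zip(values, weights))
    variance = sum((v - mean) ** 2 * w for v, w in zip(values, weights))
    return mean, math.sqrt(variance)

mean_x, sd_x = mean_and_stdev(returns, probs)

# Direct route: transform every possible value, then summarize.
y_values = [a + b * x for x in returns]
mean_y_direct, sd_y_direct = mean_and_stdev(y_values, probs)

# Shortcut route: apply the linear-function rules to the summaries of X.
mean_y_rule = a + b * mean_x
sd_y_rule = abs(b) * sd_x

print(f"direct: mean = {mean_y_direct:.2f}, stdev = {sd_y_direct:.2f}")
print(f"rules:  mean = {mean_y_rule:.2f}, stdev = {sd_y_rule:.2f}")
```

Both routes should print the same mean and standard deviation, up to floating-point rounding.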

Distribution of Two Random Variables: Scenario Approach

Background Information • An investor plans to invest in General Motors (GM) stock and gold. • He assumes that the returns on these investments over the next year depend on the general state of the economy during the year. • He identifies four possible states of the economy: depression, recession, normal and boom. These four states have the following probabilities: 0.05, 0.30, 0.50, and 0.15.

Background Information -- continued • The investor wants to analyze the joint distribution of returns on these two investments. • He also wants to analyze the distribution of a portfolio of investments in GM stock and gold.

GMGOLD.XLS • This file contains the probabilities and estimated returns of the GM stock and the gold.

Relating Two Random Variables • There are two methods for relating two random variables, the scenario approach and the joint probability approach. • The methods differ slightly in the way they assign probabilities to different outcomes. • Two summary measures, covariance and correlation, are used to measure the relationship between two variables in both methods.

The Summary Measures • We have discussed summary measures with the same names, covariance and correlation, earlier. The summary measures we are looking at now go by the same names but are conceptually different. • In the past we have calculated them from data; here they are calculated from a probability distribution. • The random variables are X and Y, and the probability that X and Y equal xi and yi, denoted p(xi, yi), is called a joint probability.

Summary Measures -- continued • Although they are calculated differently, the interpretation is essentially the same as we previously discussed. • Each indicates the direction and strength of a linear relationship between X and Y. If X and Y vary in the same direction, then both measures are positive. If they vary in opposite directions, then both measures are negative. • Covariance is more difficult to interpret because it depends on the units of measurement of X and Y. Correlation is always between -1 and +1.
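For reference, the standard definitions of these two measures for a discrete joint distribution can be written as follows. The slides do not show the formulas, so the notation here (E for the mean, Stdev for the standard deviation, and a sum over all possible pairs of values) is an assumption chosen to match the surrounding text.

```latex
\begin{align*}
\mathrm{Cov}(X, Y) &= \sum_{i} \bigl(x_i - E(X)\bigr)\bigl(y_i - E(Y)\bigr)\, p(x_i, y_i) \\
\mathrm{Corr}(X, Y) &= \frac{\mathrm{Cov}(X, Y)}{\mathrm{Stdev}(X)\,\mathrm{Stdev}(Y)}
\end{align*}
```

Dividing by the two standard deviations removes the units of measurement, which is why correlation is easier to interpret than covariance.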

The Scenario Approach • The essence of the scenario approach in this example is that a given state of the economy determines both GM and gold returns, so that only four pairs of returns are possible. • These pairs are -0.20 and 0.05, 0.10 and 0.20, 0.30 and -0.12, and 0.50 and 0.09. Each pair has a joint probability. • To calculate means, variances and standard deviations, we treat GM and gold returns separately.

Calculating Covariance and Correlation • We also need to calculate the covariance and correlation between the variables. To obtain these we use the following steps: • Deviations from means: To calculate the covariance we need the probability-weighted sum of the products of deviations from means, so we first calculate the GM deviations with the formula =C4-GMMean in B14 and copy it down through B17. We calculate the gold deviations in the same way. • Covariance: Calculate the covariance between GM and gold returns in cell B23 with the formula =SUMPRODUCT(GMDevs,GoldDevs,Probs)

Calculating Covariance and Correlation -- continued • Correlation: Calculate the correlation between GM and gold returns in cell B24 with the formula =Covar/(GMStdev*GoldStdev) • The negative covariance indicates that GM and gold returns tend to vary in opposite directions, although it is difficult to judge the strength of the relationship from the magnitude of the covariance. • The correlation of -0.410 is also negative and indicates a moderately strong relationship. We cannot infer too much from this correlation, though, because the relationship between the variables is not linear.
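The same covariance and correlation can be reproduced outside the spreadsheet. The sketch below uses the four scenario probabilities and return pairs given in the slides; the helper name `weighted_mean` is illustrative, and the layout mirrors the SUMPRODUCT steps rather than copying GMGOLD.XLS.

```python
import math

probs = [0.05, 0.30, 0.50, 0.15]           # depression, recession, normal, boom
gm_returns = [-0.20, 0.10, 0.30, 0.50]
gold_returns = [0.05, 0.20, -0.12, 0.09]

def weighted_mean(values):
    return sum(v * p for v, p in zip(values, probs))

gm_mean = weighted_mean(gm_returns)
gold_mean = weighted_mean(gold_returns)

gm_devs = [r - gm_mean for r in gm_returns]        # deviations from the mean
gold_devs = [r - gold_mean for r in gold_returns]

gm_stdev = math.sqrt(weighted_mean([d ** 2 for d in gm_devs]))
gold_stdev = math.sqrt(weighted_mean([d ** 2 for d in gold_devs]))

# =SUMPRODUCT(GMDevs,GoldDevs,Probs)
covar = sum(g * s * p for g, s, p in zip(gm_devs, gold_devs, probs))
correl = covar / (gm_stdev * gold_stdev)

print(f"covariance = {covar:.5f}, correlation = {correl:.3f}")
# prints: covariance = -0.00967, correlation = -0.410
```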

Simulation • A simulation of GM and gold returns helps explain the covariance and correlation. • There are two keys to this simulation: • First, we must simulate the states of the economy, not (at least not directly) the GM and gold returns. • We do this by entering a RAND function in A21 and then entering the formulas =VLOOKUP(A21,LTable,2) in B21 and =VLOOKUP(A21,LTable,3) in C21. • This approach uses the same random number, hence the same scenario, to generate both returns in a given row, and the effect is that only four pairs of returns are possible.

Simulation -- continued • Second, once we have the simulated returns we can calculate the covariance and correlation of these numbers. • We calculate these in cells B8 and B9 with the formulas =COVAR(SimGM,SimGold) and =CORREL(SimGM,SimGold). These are built-in Excel functions. • A comparison of these summary measures with the previously calculated summary measures shows that there is reasonably good agreement between the covariance and correlation of the probability distribution and the measures based on the simulated values. The agreement is not perfect but will tend to improve as more pairs are simulated.
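The simulation idea can be sketched in Python as well. This is an approximation of the spreadsheet approach, not a copy of it: `random.random()` plays the role of RAND, a `bisect` lookup on cumulative probabilities plays the role of the VLOOKUP table, and the sample size of 50,000 and the seed are arbitrary choices. The covariance here divides by n, which is, as far as I know, also how Excel's COVAR behaves.

```python
import bisect
import math
import random

gm_returns = [-0.20, 0.10, 0.30, 0.50]
gold_returns = [0.05, 0.20, -0.12, 0.09]
cum_probs = [0.05, 0.35, 0.85, 1.00]   # running totals of 0.05, 0.30, 0.50, 0.15

def simulate(n, seed=0):
    """Draw n (GM, gold) pairs, one shared random number per pair."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()                            # like RAND()
        state = bisect.bisect_right(cum_probs, u)   # like the table lookup
        pairs.append((gm_returns[state], gold_returns[state]))
    return pairs

def covariance_and_correlation(pairs):
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs) / n)
    return cov, cov / (sd_x * sd_y)

pairs = simulate(50_000)
sim_cov, sim_corr = covariance_and_correlation(pairs)
print(f"simulated covariance = {sim_cov:.5f}, correlation = {sim_corr:.3f}")
```

Because both returns in a pair come from the same simulated state, only the four scenario pairs ever appear, and the simulated measures should sit near -0.00967 and -0.410.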

Simulation of GM and Gold Returns

Portfolio Analysis • The final part of this example is to analyze a portfolio consisting of GM stock and gold. • We assume that the investor has $10,000 and puts some fraction of this in GM stock and the rest in gold. • The key to the analysis is that there are only four possible scenarios -- that is, there are only four possible portfolio returns. • In this case we calculate the entire portfolio return distribution and summary measures in the usual way.

Portfolio Analysis -- continued • One thing of interest is to see how the expected portfolio return and the standard deviation of portfolio return change as the amount the investor puts into GM stock changes. • To do this we use a data table for the mean and stdev of portfolio return as a function of GM investment. • A graph of these measures shows that the expected portfolio return steadily increases as more and more is put into GM.

Portfolio Analysis -- continued • However, we must note that the standard deviation, often used as a measure of risk, first decreases, then increases. • This means there is a trade-off between expected return and risk (as measured by the standard deviation). • The investor could obtain a higher expected return by putting more of his money into GM; but past a GM fraction of approximately 0.4, the risk also increases.
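The data table described above can be approximated with a short loop. This sketch assumes the portfolio return in each scenario is the weighted average of the GM and gold returns, with the GM fraction as the weight; the function name `portfolio_summary` and the 0.1-step grid are illustrative choices, not taken from the workbook.

```python
import math

probs = [0.05, 0.30, 0.50, 0.15]
gm_returns = [-0.20, 0.10, 0.30, 0.50]
gold_returns = [0.05, 0.20, -0.12, 0.09]

def portfolio_summary(gm_fraction):
    """Mean and stdev of portfolio return for a given fraction held in GM."""
    port = [gm_fraction * g + (1 - gm_fraction) * s
            for g, s in zip(gm_returns, gold_returns)]
    mean = sum(r * p for r, p in zip(port, probs))
    variance = sum((r - mean) ** 2 * p for r, p in zip(port, probs))
    return mean, math.sqrt(variance)

# One row per GM fraction: 0.0, 0.1, ..., 1.0
rows = [(i / 10, *portfolio_summary(i / 10)) for i in range(11)]

for fraction, mean, stdev in rows:
    print(f"GM fraction {fraction:.1f}: mean = {mean:.4f}, stdev = {stdev:.4f}")
```

On this grid the mean rises with every step, while the smallest standard deviation occurs at a GM fraction of 0.4, in line with the turning point described in the slides.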

Distribution of Portfolio Return

Distribution of Two Random Variables: Joint Probability Approach

SUBS.XLS • A company sells two products, product 1 and product 2, that tend to be substitutes for one another. The company has assessed the joint probability distribution of demand for the two products during the coming months. • This joint distribution appears in the Demand sheet of this file. • The left and top margins of the table show the possible values of demand for the products.

SUBS.XLS -- continued • Demand for product 1 (D1) can range from 100 to 400 (in increments of 100), and demand for product 2 (D2) can range from 50 to 250 (in increments of 50). • Each possible value of D1 can occur with each possible value of D2, with the joint probability given in the table. • Given this joint probability distribution, describe more fully the probabilistic structure of demands for the two products.

Joint Probability Approach • In this example we use an alternative method for specifying a probability distribution. • A joint probability distribution, specified by all probabilities of the form p(x, y), indicates how X and Y are related and also how each of X and Y is distributed in its own right. • The joint probability distribution of X and Y determines the marginal distributions of both X and Y, where each marginal distribution is the probability distribution of a single random variable.
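To illustrate how a joint distribution determines the marginals, the sketch below sums a small joint table across rows and columns. The table in SUBS.XLS is not reproduced in these slides, so `joint_pct` is a HYPOTHETICAL, reduced 2-by-3 example (in hundredths) and not the company's actual assessment.

```python
from fractions import Fraction

# HYPOTHETICAL joint probabilities p(d1, d2) in hundredths; NOT the SUBS.XLS data.
joint_pct = {
    (100, 50): 5,  (100, 100): 15, (100, 150): 25,
    (200, 50): 25, (200, 100): 20, (200, 150): 10,
}
joint = {pair: Fraction(pct, 100) for pair, pct in joint_pct.items()}

# A marginal probability is the sum of the joint probabilities over all
# values of the other variable.
marginal_d1 = {}
marginal_d2 = {}
for (d1, d2), p in joint.items():
    marginal_d1[d1] = marginal_d1.get(d1, 0) + p
    marginal_d2[d2] = marginal_d2.get(d2, 0) + p

for d1, p in sorted(marginal_d1.items()):
    print(f"P(D1 = {d1}) = {float(p):.2f}")
for d2, p in sorted(marginal_d2.items()):
    print(f"P(D2 = {d2}) = {float(p):.2f}")
```

Each marginal distribution sums to 1, just as the full joint table does.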
